NUCLEIC ACIDS AND PROTEINS FROM STREPTOCOCCUS PNEUMONIAE

The present invention relates to proteins derived from Streptococcus pneumoniae,
nucleic acid molecules encoding such proteins, the use of the nucleic acid and/or
proteins as antigens/immunogens and in detection/diagnosis, as well as methods for
screening the proteins/nucleic acid sequences as potential anti-microbial targets.

Streptococcus pneumoniae, commonly referred to as the pneumococcus, is an
important pathogenic organism. The continuing significance of Streptococcus
pneumoniae infections in relation to human disease in developing and developed
countries has been authoritatively reviewed (Siber, G.R., Science, 265: 1385-1387
(1994)). That review indicates that, on a global scale, this organism is believed to be
the most common bacterial cause of acute respiratory infections, and is estimated to
result in 1 million childhood deaths each year, mostly in developing countries
(Stansfield, S.K., Pediatr. Infect. Dis., 6: 622 (1987)). In the USA it has been
suggested (Breiman et al., Arch. Intern. Med., 150: 1401 (1990)) that the
pneumococcus is still the most common cause of bacterial pneumonia, and that
disease rates are particularly high in young children, in the elderly, and in patients
with predisposing conditions such as asplenia, heart, lung and kidney disease,
diabetes, alcoholism, or with immunosuppressive disorders, especially AIDS. These
groups are at higher risk of pneumococcal septicaemia and hence meningitis and
therefore have a greater risk of dying from pneumococcal infection. The
pneumococcus is also the leading cause of otitis media and sinusitis, which remain
prevalent infections in children in developed countries, and which incur substantial
costs.

The need for effective preventative strategies against pneumococcal infection is
highlighted by the recent emergence of penicillin-resistant pneumococci. It has been
reported that 6.6% of pneumococcal isolates in 13 US hospitals in 12 states were found
to be resistant to penicillin and some isolates were also resistant to other antibiotics
including third generation cephalosporins (Schappert, S.M., Vital and Health Statistics
of the Centres for Disease Control/National Centre for Health Statistics, 214:1
(1992)). The rates of penicillin resistance can be higher (up to 20%) in some
hospitals (Breiman et al., J. Am. Med. Assoc., 271: 1831 (1994)). Since the
development of penicillin resistance among pneumococci is both recent and sudden,
coming after decades during which penicillin remained an effective treatment, these
findings are regarded as alarming.

For the reasons given above, there are compelling grounds for considering
improvements in the means of preventing, controlling, diagnosing or treating
pneumococcal diseases.

Various approaches have been taken in order to provide vaccines for the prevention
of pneumococcal infections. Difficulties arise for instance in view of the variety of
serotypes (at least 90) based on the structure of the polysaccharide capsule
surrounding the organism. Vaccines against individual serotypes are not effective
against other serotypes and this means that vaccines must include polysaccharide
antigens from a whole range of serotypes in order to be effective in a majority of
cases. An additional problem arises because it has been found that the capsular
polysaccharides (each of which determines the serotype and is the major protective
antigen) when purified and used as a vaccine do not reliably induce protective
antibody responses in children under two years of age, the age group which suffers
the highest incidence of invasive pneumococcal infection and meningitis.

A modification of the approach using capsule antigens relies on conjugating the
polysaccharide to a protein in order to derive an enhanced immune response,
particularly by giving the response T-cell dependent character. This approach has
been used in the development of a vaccine against Haemophilus influenzae, for
instance. There are, however, issues of cost concerning both the
multi-polysaccharide vaccines and those based on conjugates.

A third approach is to look for other antigenic components which offer the potential
to be vaccine candidates. This is the basis of the present invention. Using a specially
developed bacterial expression system, we have been able to identify a group of
protein antigens from pneumococcus which are associated with the bacterial
envelope or which are secreted.

Thus, in a first aspect the present invention provides a Streptococcus pneumoniae
protein or polypeptide having a sequence selected from those shown in Table 1.

In a second aspect, the present invention provides a Streptococcus pneumoniae
protein or polypeptide having a sequence selected from those shown in Table 2.

A protein or polypeptide of the present invention may be provided in substantially pure
form. For example, it may be provided in a form which is substantially free of other
proteins.

As discussed herein, the proteins and polypeptides of the invention are useful as
antigenic material. Such material can be "antigenic" and/or "immunogenic".
Generally, "antigenic" is taken to mean that the protein or polypeptide is capable of
being used to raise antibodies or indeed is capable of inducing an antibody response in
a subject. "Immunogenic" is taken to mean that the protein or polypeptide is capable of
eliciting a protective immune response in a subject. Thus, in the latter case, the protein
or polypeptide may be capable of not only generating an antibody response but, in
addition, a non-antibody based immune response.

The skilled person will appreciate that homologues or derivatives of the proteins or
polypeptides of the invention will also find use in the context of the present invention,
i.e. as antigenic/immunogenic material. Thus, for instance, proteins or polypeptides
which include one or more additions, deletions, substitutions or the like are
encompassed by the present invention. In addition, it may be possible to replace one
amino acid with another of similar "type", for instance by replacing one hydrophobic
amino acid with another.

One can use a program such as the CLUSTAL program to compare amino acid
sequences. This program compares amino acid sequences and finds the optimal
alignment by inserting spaces in either sequence as appropriate. It is possible to
calculate amino acid identity or similarity (identity plus conservation of amino acid
type) for an optimal alignment. A program like BLASTx will align the longest stretch
of similar sequences and assign a value to the fit. It is thus possible to obtain a
comparison where several regions of similarity are found, each having a different
score. Both types of identity analysis are contemplated in the present invention.
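By way of illustration only, the distinction between identity and similarity drawn above can be sketched in Python. The sketch assumes that the two sequences have already been aligned (for example by a program such as CLUSTAL) and that gaps are written as "-". The grouping of amino acids by "type" and the choice of the number of alignment columns as the denominator are assumptions made for this example; neither is prescribed by the present description.

```python
# Hypothetical grouping of amino acids by "type"; other groupings are equally valid.
SIMILARITY_GROUPS = [
    set("AVLIMFWC"),  # hydrophobic
    set("STNQYGP"),   # polar or small
    set("DE"),        # acidic
    set("KRH"),       # basic
]


def same_type(a, b):
    """Return True if both residues fall within one of the assumed groups."""
    return any(a in group and b in group for group in SIMILARITY_GROUPS)


def identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Return (percent identity, percent similarity) for a pre-aligned pair.

    Similarity counts identical residues plus residues of the same type.
    Columns containing a gap in one sequence count towards the denominator
    but never towards identity or similarity (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = similar = compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no information
        compared += 1
        if a == gap or b == gap:
            continue
        if a == b:
            identical += 1
            similar += 1
        elif same_type(a, b):
            similar += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * identical / compared, 100.0 * similar / compared
```

For the short hypothetical alignment "MKV-LA" against "MRVALG", three of six columns are identical and a fourth (K against R) is conserved in type, so similarity exceeds identity.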

In the case of homologues and derivatives, the degree of identity with a protein or
polypeptide as described herein is less important than that the homologue or derivative
should retain the antigenicity or immunogenicity of the original protein or polypeptide.
However, suitably, homologues or derivatives having at least 60% similarity (as
discussed above) with the proteins or polypeptides described herein are provided.
Preferably, homologues or derivatives having at least 70% similarity, more preferably
at least 80% similarity are provided. Most preferably, homologues or derivatives
having at least 90% or even 95% similarity are provided.

In an alternative approach, the homologues or derivatives could be fusion proteins,
incorporating moieties which render purification easier, for example by effectively
tagging the desired protein or polypeptide. It may be necessary to remove the "tag" or
it may be the case that the fusion protein itself retains sufficient antigenicity to be
useful.

In an additional aspect of the invention there are provided antigenic/immunogenic
fragments of the proteins or polypeptides of the invention, or of homologues or
derivatives thereof.

For fragments of the proteins or polypeptides described herein, or of homologues or
derivatives thereof, the situation is slightly different. It is well known that it is possible to
screen an antigenic protein or polypeptide to identify epitopic regions, i.e. those regions
which are responsible for the protein or polypeptide's antigenicity or immunogenicity.
Methods for carrying out such screening are well known in the art. Thus, the fragments
of the present invention should include one or more such epitopic regions or be
sufficiently similar to such regions to retain their antigenic/immunogenic properties.
Thus, for fragments according to the present invention the degree of identity is perhaps
irrelevant, since they may be 100% identical to a particular part of a protein or
polypeptide, homologue or derivative as described herein. The key issue, once again, is
that the fragment retains the antigenic/immunogenic properties.

Thus, what is important for homologues, derivatives and fragments is that they possess
at least a degree of the antigenicity/immunogenicity of the protein or polypeptide from
which they are derived.

Gene cloning techniques may be used to provide a protein of the invention in
substantially pure form. These techniques are disclosed, for example, in J. Sambrook
et al., Molecular Cloning, 2nd Edition, Cold Spring Harbor Laboratory Press (1989).
Thus, in a third aspect, the present invention provides a nucleic acid molecule
comprising or consisting of a sequence which is:

(i) any of the DNA sequences set out in Table 1 or their RNA equivalents; 

(ii) a sequence which is complementary to any of the sequences of (i); 

(iii) a sequence which codes for the same protein or polypeptide, as those 
sequences of (i) or (ii); 

(iv) a sequence which has substantial identity with any of those of (i), (ii)
and (iii); or

(v) a sequence which codes for a homologue, derivative or fragment of a
protein as defined in Table 1.

In a fourth aspect the present invention provides a nucleic acid molecule comprising or 
consisting of a sequence which is: 

(i) any of the DNA sequences set out in Table 2 or their RNA equivalents; 

(ii) a sequence which is complementary to any of the sequences of (i); 

(iii) a sequence which codes for the same protein or polypeptide, as those 
sequences of (i) or (ii); 

(iv) a sequence which has substantial identity with any of those of (i), (ii)
and (iii); or

(v) a sequence which codes for a homologue, derivative or fragment of a
protein as defined in Table 2.

The nucleic acid molecules of the invention may include a plurality of such sequences,
and/or fragments. The skilled person will appreciate that the present invention can
include novel variants of those particular novel nucleic acid molecules which are
exemplified herein. Such variants are encompassed by the present invention. These
may occur in nature, for example because of strain variation. For example, additions,
substitutions and/or deletions are included. In addition, and particularly when utilising
microbial expression systems, one may wish to engineer the nucleic acid sequence by
making use of known preferred codon usage in the particular organism being used for
expression. Thus, synthetic or non-naturally occurring variants are also included within
the scope of the invention.
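As a minimal sketch of engineering a sequence according to preferred codon usage, the following Python fragment replaces each codon of a coding sequence with a host-preferred synonymous codon and checks that the encoded protein is unchanged. The table PREFERRED_CODON is purely hypothetical and is given for illustration; in practice the preferences would be taken from codon usage data for the expression organism concerned. The standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical host preferences (illustration only, not real usage data).
PREFERRED_CODON = {"L": "TTA", "K": "AAA", "E": "GAA"}


def translate(dna):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna, preferred):
    """Swap each codon for the preferred synonymous codon, where one is listed."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("coding sequence length must be a multiple of three")
    codons = (dna[i:i + 3] for i in range(0, len(dna), 3))
    recoded = "".join(preferred.get(CODON_TABLE[c], c) for c in codons)
    # Guard against a preference table that would alter the protein.
    if translate(recoded) != translate(dna):
        raise ValueError("preferred codon table changes the encoded protein")
    return recoded
```

For example, recode("ATGCTGAAGGAGTAA", PREFERRED_CODON) leaves the start and stop codons alone and substitutes the listed codons for leucine, lysine and glutamate, giving a synthetic variant encoding the same polypeptide.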

The term "RNA equivalent" when used above indicates that a given RNA molecule has
a sequence which is complementary to that of a given DNA molecule (allowing for the
fact that in RNA "U" replaces "T" in the genetic code).
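The relationship between a DNA sequence and a complementary RNA sequence can be illustrated with a short Python sketch. It assumes an unambiguous DNA sequence (A, C, G and T only) written 5' to 3', and returns the complementary RNA strand also written 5' to 3' (that is, the reverse complement with "U" in place of "T"). The function name is hypothetical.

```python
# Watson-Crick pairing from a DNA base to the complementary RNA base.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def complementary_rna(dna):
    """Return the RNA strand complementary to `dna`, written 5' to 3'."""
    try:
        return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in reversed(dna.upper()))
    except KeyError as exc:
        raise ValueError(f"unexpected base in DNA sequence: {exc.args[0]!r}") from exc
```

For instance, the DNA sequence 5'-ATGC-3' pairs with the RNA sequence 5'-GCAU-3'.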

When comparing nucleic acid sequences for the purposes of determining the degree of
homology or identity one can use programs such as BESTFIT and GAP (both from the
Wisconsin Genetics Computer Group (GCG) software package). BESTFIT, for
example, compares two sequences and produces an optimal alignment of the most
similar segments. GAP enables sequences to be aligned along their whole length and
finds the optimal alignment by inserting spaces in either sequence as appropriate.
Suitably, in the context of the present invention when discussing identity of nucleic acid
sequences, the comparison is made by alignment of the sequences along their whole
length.

Preferably, sequences which have substantial identity have at least 50% sequence
identity, desirably at least 75% sequence identity and more desirably at least 90 or at
least 95% sequence identity with said sequences. In some cases the sequence identity
may be 99% or above.

Desirably, the term "substantial identity" indicates that said sequence has a greater
degree of identity with any of the sequences described herein than with prior art nucleic
acid sequences.
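To illustrate what is meant by comparing two nucleic acid sequences along their whole length, a minimal global alignment (of the Needleman-Wunsch type) is sketched below in Python, followed by a percent identity calculation. This is not the GAP program itself; the scoring values (match +1, mismatch -1, gap -2, with no separate gap extension penalty) and the use of the alignment length as the denominator are assumptions chosen for simplicity.

```python
def global_align(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Align two sequences end to end; return both strings with '-' for gaps."""
    n, m = len(seq_a), len(seq_b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i, j):
        return match if seq_a[i - 1] == seq_b[j - 1] else mismatch

    # Fill the dynamic programming matrix.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align the two bases
                score[i - 1][j] + gap,             # gap in seq_b
                score[i][j - 1] + gap,             # gap in seq_a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(seq_a[i - 1])
            out_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of the alignment length."""
    matches = sum(a == b and a != "-" for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)
```

The resulting percentage can then be compared with whichever threshold (for example 50%, 75%, 90% or 95%) is being applied.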

It should however be noted that where a nucleic acid sequence of the present invention
codes for at least part of a novel gene product the present invention includes within its
scope all possible sequences coding for the gene product or for a novel part thereof.

The nucleic acid molecule may be in isolated or recombinant form. It may be
incorporated into a vector and the vector may be incorporated into a host. Such vectors
and suitable hosts form yet further aspects of the present invention.

Therefore, for example, by using probes based upon the nucleic acid sequences
provided herein, genes in Streptococcus pneumoniae can be identified. They can then
be excised using restriction enzymes and cloned into a vector. The vector can be
introduced into a suitable host for expression.

Nucleic acid molecules of the present invention may be obtained from S. pneumoniae by
the use of appropriate probes complementary to part of the sequences of the nucleic
acid molecules. Restriction enzymes or sonication techniques can be used to obtain
appropriately sized fragments for probing.






Alternatively, PCR techniques may be used to amplify a desired nucleic acid sequence.
Thus the sequence data provided herein can be used to design two primers for use in
PCR so that a desired sequence, including whole genes or fragments thereof, can be
targeted and then amplified to a high degree.

Typically primers will be at least 15-25 nucleotides long. 
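The choice of two primers flanking a desired sequence can be sketched as follows in Python. The sketch simply takes the first bases of the target as the forward primer and the reverse complement of the last bases as the reverse primer. It deliberately ignores considerations such as melting temperature, GC content and primer-dimer formation, and the function names and the default length of 20 nucleotides are assumptions made for the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(template, start, end, length=20):
    """Return (forward, reverse) primers for amplifying template[start:end].

    `template` is one DNA strand written 5' to 3'. Both primers are returned
    5' to 3'. The minimum length of 15 follows the text above.
    """
    if length < 15:
        raise ValueError("primers should be at least 15 nucleotides long")
    target = template[start:end].upper()
    if len(target) < 2 * length:
        raise ValueError("target too short for two non-overlapping primers")
    forward = target[:length]                      # anneals to the complementary strand
    reverse = reverse_complement(target[-length:])  # anneals to the given strand
    return forward, reverse
```

The forward primer is identical to the start of the target region, whereas the reverse primer is complementary to its end, so that extension from the two primers converges on the sequence lying between them.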

As a further alternative, chemical synthesis may be used. This may be automated.
Relatively short sequences may be chemically synthesised and ligated together to
provide a longer sequence.

There is another group of proteins from S. pneumoniae which have been identified
using the bacterial expression system described herein. These are known proteins
from S. pneumoniae, which have not previously been identified as antigenic proteins.
The amino acid sequences of this group of proteins, together with DNA sequences
coding for them, are shown in Table 3. These proteins, or homologues, derivatives
and/or fragments thereof also find use as antigens/immunogens. Thus, in another
aspect the present invention provides the use of a protein or polypeptide having a
sequence selected from those shown in Tables 1-3, or homologues, derivatives
and/or fragments thereof, as an immunogen/antigen.

In yet a further aspect the present invention provides an immunogenic/antigenic
composition comprising one or more proteins or polypeptides selected from those
whose sequences are shown in Tables 1-3, or homologues or derivatives thereof,
and/or fragments of any of these. In preferred embodiments, the
immunogenic/antigenic composition is a vaccine or is for use in a diagnostic assay.

In the case of vaccines, suitable additional excipients, diluents, adjuvants or the like
may be included. Numerous examples of these are well known in the art.

It is also possible to utilise the nucleic acid sequences shown in Tables 1-3 in the
preparation of so-called DNA vaccines. Thus, the invention also provides a vaccine
composition comprising one or more nucleic acid sequences as defined herein. DNA
vaccines are described in the art (see for instance, Donnelly et al., Ann. Rev.
Immunol., 15:617-648 (1997)) and the skilled person can use such art-described
techniques to produce and use DNA vaccines according to the present invention.

As already discussed herein, the proteins or polypeptides described herein, their
homologues or derivatives, and/or fragments of any of these, can be used in methods
of detecting/diagnosing S. pneumoniae. Such methods can be based on the detection
of antibodies against such proteins which may be present in a subject. Therefore the
present invention provides a method for the detection/diagnosis of S. pneumoniae
which comprises the step of bringing into contact a sample to be tested with at least
one protein, or homologue, derivative or fragment thereof, as described herein.
Suitably, the sample is a biological sample, such as a tissue sample or a sample of
blood or saliva obtained from a subject to be tested.

In an alternative approach, the proteins described herein, or homologues, derivatives
and/or fragments thereof, can be used to raise antibodies, which in turn can be used
to detect the antigens, and hence S. pneumoniae. Such antibodies form another aspect
of the invention. Antibodies within the scope of the present invention may be
monoclonal or polyclonal.

Polyclonal antibodies can be raised by stimulating their production in a suitable animal
host (e.g. a mouse, rat, guinea pig, rabbit, sheep, goat or monkey) when a protein as
described herein, or a homologue, derivative or fragment thereof, is injected into the
animal. If desired, an adjuvant may be administered together with the protein.
Well-known adjuvants include Freund's adjuvant (complete and incomplete) and aluminium
hydroxide. The antibodies can then be purified by virtue of their binding to a protein as
described herein.

Monoclonal antibodies can be produced from hybridomas. These can be formed by
fusing myeloma cells and spleen cells which produce the desired antibody in order to
form an immortal cell line. Thus the well-known Kohler & Milstein technique (Nature
256 (1975)) or subsequent variations upon this technique can be used.

Techniques for producing monoclonal and polyclonal antibodies that bind to a
particular polypeptide/protein are now well developed in the art. They are discussed in
standard immunology textbooks, for example in Roitt et al., Immunology, second edition
(1989), Churchill Livingstone, London.

In addition to whole antibodies, the present invention includes derivatives thereof which
are capable of binding to proteins etc. as described herein. Thus the present invention
includes antibody fragments and synthetic constructs. Examples of antibody fragments
and synthetic constructs are given by Dougall et al. in Tibtech 12: 372-379 (September
1994).

Antibody fragments include, for example, Fab, F(ab')2 and Fv fragments. (These are
discussed in Roitt et al. [supra].) Fv fragments can be modified
to produce a synthetic construct known as a single chain Fv (scFv) molecule. This
includes a peptide linker covalently joining VH and VL regions, which contributes to the
stability of the molecule. Other synthetic constructs that can be used include CDR
peptides. These are synthetic peptides comprising antigen-binding determinants.
Peptide mimetics may also be used. These molecules are usually conformationally
restricted organic rings that mimic the structure of a CDR loop and that include
antigen-interactive side chains.

Synthetic constructs include chimaeric molecules. Thus, for example, humanised (or
primatised) antibodies or derivatives thereof are within the scope of the present
invention. An example of a humanised antibody is an antibody having human
framework regions, but rodent hypervariable regions. Ways of producing chimaeric
antibodies are discussed for example by Morrison et al. in PNAS, 81, 6851-6855 (1984)
and by Takeda et al. in Nature, 314, 452-454 (1985).

Synthetic constructs also include molecules comprising an additional moiety that 
provides the molecule with some desirable property in addition to antigen binding. For 
example the moiety may be a label (e.g. a fluorescent or radioactive label). 
Alternatively, it may be a pharmaceutically active agent. 


Antibodies, or derivatives thereof, find use in detection/diagnosis of S. pneumoniae.
Thus, in another aspect the present invention provides a method for the
detection/diagnosis of S. pneumoniae which comprises the step of bringing into contact
a sample to be tested and antibodies capable of binding to one or more proteins
described herein, or to homologues, derivatives and/or fragments thereof.

In addition, so-called "Affibodies" may be utilised. These are binding proteins
selected from combinatorial libraries of an alpha-helical bacterial receptor domain
(Nord et al.). Thus, small protein domains capable of specific binding to different
target proteins can be selected using combinatorial approaches.

It will also be clear that the nucleic acid sequences described herein may be used to
detect/diagnose S. pneumoniae. Thus, in yet a further aspect, the present invention
provides a method for the detection/diagnosis of S. pneumoniae which comprises the
step of bringing into contact a sample to be tested with at least one nucleic acid
sequence as described herein. Suitably, the sample is a biological sample, such as a
tissue sample or a sample of blood or saliva obtained from a subject to be tested.
Such samples may be pre-treated before being used in the methods of the invention.
Thus, for example, a sample may be treated to extract DNA. Then, DNA probes
based on the nucleic acid sequences described herein (i.e. usually fragments of such
sequences) may be used to detect nucleic acid from S. pneumoniae.

In additional aspects, the present invention provides:

(a) a method of vaccinating a subject against S. pneumoniae which comprises the
step of administering to a subject a protein or polypeptide of the invention, or a
derivative, homologue or fragment thereof, or an immunogenic composition of the
invention;

(b) a method of vaccinating a subject against S. pneumoniae which comprises the
step of administering to a subject a nucleic acid molecule as defined herein;

(c) a method for the prophylaxis or treatment of S. pneumoniae infection which
comprises the step of administering to a subject a protein or polypeptide of the
invention, or a derivative, homologue or fragment thereof, or an immunogenic
composition of the invention;

(d) a method for the prophylaxis or treatment of S. pneumoniae infection which
comprises the step of administering to a subject a nucleic acid molecule as defined
herein;

(e) a kit for use in detecting/diagnosing S. pneumoniae infection comprising one
or more proteins or polypeptides of the invention, or homologues, derivatives or
fragments thereof, or an antigenic composition of the invention; and

(f) a kit for use in detecting/diagnosing S. pneumoniae infection comprising one
or more nucleic acid molecules as defined herein.

Given that we have identified a group of important proteins, such proteins are
potential targets for anti-microbial therapy. It is necessary, however, to determine
whether each individual protein is essential for the organism's viability. Thus, the
present invention also provides a method of determining whether a protein or
polypeptide as described herein represents a potential anti-microbial target which
comprises antagonising, inhibiting or otherwise interfering with the function or
expression of said protein and determining whether S. pneumoniae is still viable.

A suitable method for inactivating the protein is to effect selected gene knockouts, i.e.
prevent expression of the protein and determine whether this results in a lethal
change. Suitable methods for carrying out such gene knockouts are described in Li
et al., P.N.A.S., 94:13251-13256 (1997) and Kolkman et al., 178:3736-3741
(1996).

In a final aspect the present invention provides the use of an agent capable of
antagonising, inhibiting or otherwise interfering with the function or expression of a
protein or polypeptide of the invention in the manufacture of a medicament for use in
the treatment or prophylaxis of S. pneumoniae infection.

As mentioned above, we have used a bacterial expression system as a means of
identifying those proteins which are surface associated, secreted or exported and
thus would find use as antigens.

The information necessary for the secretion/export of proteins has been extensively
studied in bacteria. In the majority of cases, protein export requires a signal peptide
to be present at the N-terminus of the precursor protein so that it becomes directed to
the translocation machinery on the cytoplasmic membrane. During or after
translocation, the signal peptide is removed by a membrane associated signal
peptidase. Ultimately the localization of the protein (i.e. whether it be secreted, an
integral membrane protein or attached to the cell wall) is determined by sequences
other than the leader peptide itself.

We are specifically interested in surface located or exported proteins as these are
likely to be antigens for use in vaccines, as diagnostic reagents or as targets for
therapy with novel chemical entities. We have therefore developed a screening
vector-system in Lactococcus lactis that permits genes encoding exported proteins to
be identified and isolated. We provide below a representative example showing how
given novel surface associated proteins from Streptococcus pneumoniae have been
identified and characterized. The screening vector incorporates the staphylococcal
nuclease gene nuc lacking its own export signal as a secretion reporter.
Staphylococcal nuclease is a naturally secreted, heat-stable, monomeric enzyme
which has been efficiently expressed and secreted in a range of Gram positive
bacteria (Shortle, Gene, 22:181-189 (1983); Kovacevic et al., J. Bacteriol.,
162:521-528 (1985); Miller et al., J. Bacteriol., 169:3508-3514 (1987); Liebl et al.,
J. Bacteriol., 174:1854-1861 (1992); Le Loir et al., J. Bacteriol., 176:5135-5139
(1994); Poquet et al., J. Bacteriol., 180:1904-1912 (1998)).

Recently, Poquet et al. ((1998), supra) have described a screening vector
incorporating the nuc gene lacking its own signal leader as a reporter to identify
exported proteins in Gram positive bacteria, and have applied it to L. lactis. This
vector (pFUN) contains the pAMβ1 replicon which functions in a broad host range
of Gram-positive bacteria in addition to the ColE1 replicon that promotes replication
in Escherichia coli and certain other Gram negative bacteria. Unique cloning sites
present in the vector can be used to generate transcriptional and translational fusions
between cloned genomic DNA fragments and the open reading frame of the
truncated nuc gene devoid of its own signal secretion leader. The nuc gene makes an
ideal reporter gene because the secretion of nuclease can readily be detected using a
simple and sensitive plate test: recombinant colonies secreting the nuclease develop
a pink halo whereas control colonies remain white (Shortle, (1983), supra; Le Loir
et al., (1994), supra).

Thus, the invention will now be described with reference to the following
representative example, which provides details of how the proteins, polypeptides and
nucleic acid sequences described herein were identified as antigenic targets.

We describe herein the construction of three reporter vectors and their use in
L. lactis to identify and isolate genomic DNA fragments from Streptococcus
pneumoniae encoding secreted or surface associated proteins.

The invention will now be described with reference to the examples, which should
not be construed as in any way limiting the invention. The examples refer to the
figures in which:

Figure 1: shows the results of a number of DNA vaccine trials; and 
Figure 2: shows the results of further DNA vaccine trials.

EXAMPLE 1 

(i) Construction of the pTREP1-nuc series of reporter vectors

(a) Construction of expression plasmid pTREP1

The pTREP1 plasmid is a high-copy number (40-80 per cell) theta-replicating
Gram-positive plasmid, which is a derivative of the pTREX plasmid which is itself a
derivative of the previously published pIL253 plasmid. pIL253 incorporates the
broad Gram-positive host range replicon of pAMβ1 (Simon and Chopin, Biochimie,
70:559-567 (1988)) and is non-mobilisable by the L. lactis sex-factor. pIL253 also
lacks the tra function which is necessary for transfer or efficient mobilisation by
conjugative parent plasmids exemplified by pIL501. The Enterococcal pAMβ1
replicon has previously been transferred to various species including Streptococcus,
Lactobacillus and Bacillus species as well as Clostridium acetobutylicum (Oultram
and Klaenhammer, FEMS Microbiological Letters, 27:129-134 (1985); Gibson et
al., (1979); LeBlanc et al., Proceedings of the National Academy of Science USA,
75:3484-3487 (1978)), indicating the potential broad host range utility. The pTREP1
plasmid represents a constitutive transcription vector.

The pTREX vector was constructed as follows. An artificial DNA fragment
containing a putative RNA stabilising sequence, a translation initiation region (TIR),
a multiple cloning site for insertion of the target genes and a transcription terminator
was created by annealing 2 complementary oligonucleotides and extending with Tfl
DNA polymerase. The sense and anti-sense oligonucleotides contained the
recognition sites for NheI and BamHI at their 5' ends respectively to facilitate
cloning. This fragment was cloned between the XbaI and BamHI sites in
pUC19NT7, a derivative of pUC19 which contains the T7 expression cassette from
pLET1 (Wells et al., J. Appl. Bacteriol., 74:629-636 (1993)) cloned between the
EcoRI and HindIII sites. The resulting construct was designated pUCLEX. The
complete expression cassette of pUCLEX was then removed by cutting with HindIII
and blunting followed by cutting with EcoRI before cloning into EcoRI and SacI
(blunted) sites of pIL253 to generate the vector pTREX (Wells and Schofield, In
Current advances in metabolism, genetics and applications-NATO ASI Series, H
98:37-62 (1996)). The putative RNA stabilising sequence and TIR are derived from
the Escherichia coli T7 bacteriophage sequence and modified at one nucleotide
position to enhance the complementarity of the Shine-Dalgarno (SD) motif to the
ribosomal 16S RNA of Lactococcus lactis (Schofield et al., pers. comm., University of
Cambridge Dept. Pathology).

A Lactococcus lactis MG1363 chromosomal DNA fragment exhibiting promoter
activity, which was subsequently designated P7, was cloned between the EcoRI and
BglII sites present in the expression cassette, creating pTREX7. This active
promoter region had been previously isolated using the promoter probe vector
pSB292 (Waterfield et al., Gene, 165:9-15 (1995)). The promoter fragment was
amplified by PCR using the Vent DNA polymerase according to the manufacturer's
instructions.

The pTREP1 vector was then constructed as follows. An artificial DNA fragment
which included a transcription terminator, the forward pUC sequencing primer, a
promoter multiple-cloning site region and a universal translation stop sequence was
created by annealing two overlapping partially complementary synthetic
oligonucleotides together and extending with Sequenase according to the
manufacturer's instructions. The sense and anti-sense (pTREPF and pTREPR)
oligonucleotides contained the recognition sites for EcoRV and BamHI at their 5'
ends respectively to facilitate cloning into pTREX7. The transcription terminator
was that of the Bacillus penicillinase gene, which has been shown to be effective in
Lactococcus (Jos et al., Applied and Environmental Microbiology, 50:540-542
(1985)). This was considered necessary as expression of target genes in the pTREX
vectors was observed to be leaky and is thought to be the result of cryptic promoter
activity in the origin region (Schofield et al., pers. comm., University of Cambridge
Dept. Pathology). The forward pUC sequencing primer was included to enable
direct sequencing of cloned DNA fragments. The translation stop sequence, which
encodes a stop codon in 3 different frames, was included to prevent translational
fusions between vector genes and cloned DNA fragments. The pTREX7 vector was
first digested with EcoRI and blunted using the 5'-3' polymerase activity of T4 DNA
polymerase (NEB) according to the manufacturer's instructions. The EcoRI digested
and blunt ended pTREX7 vector was then digested with BglII, thus removing the P7
promoter. The artificial DNA fragment derived from the annealed synthetic
oligonucleotides was then digested with EcoRV and BamHI and cloned into the
EcoRI(blunted)-BglII digested pTREX7 vector to generate pTREP. A Lactococcus
lactis MG1363 chromosomal promoter designated P1 was then cloned between the
EcoRI and BglII sites present in the pTREP expression cassette, forming pTREP1.
This promoter was also isolated using the promoter probe vector pSB292 and
characterised by Waterfield et al., (1995), supra. The P1 promoter fragment was
originally amplified by PCR using Vent DNA polymerase according to the
manufacturer's instructions and cloned into pTREX as an EcoRI-BglII DNA
fragment. The EcoRI-BglII P1 promoter containing fragment was removed from
pTREX1 by restriction enzyme digestion and used for cloning into pTREP
(Schofield et al., pers. comm., University of Cambridge, Dept. Pathology).
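The role of a translation stop sequence that carries a stop codon in each of
the three forward reading frames can be illustrated with a short Python sketch.
The segment used below is read from the pTREPR oligonucleotide listed in
Appendix 1 (reverse-complemented into the sense orientation); because that
listing derives from a scanned source, the segment should be treated as
illustrative only, and the function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def frames_with_stop(sequence):
    """Return the set of forward reading frames (0, 1, 2) containing a stop codon."""
    sequence = sequence.upper()
    frames = set()
    for position in range(len(sequence) - 2):
        if sequence[position:position + 3] in STOP_CODONS:
            frames.add(position % 3)
    return frames


# Illustrative segment (assumed transcription of the PmeI/PacI region of pTREPR,
# written in the sense orientation).
stop_segment = "GTTTAAACATTAATTAA"

print(sorted(frames_with_stop(stop_segment)))
```

A segment of this kind terminates translation whichever frame a ribosome
enters in, which is the property relied upon to prevent translational fusions.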

(b) PCR amplification of the S. aureus nuc gene

The nucleotide sequence of the S. aureus nuc gene (EMBL database accession
number V01281) was used to design synthetic oligonucleotide primers for PCR
amplification. The primers were designed to amplify the mature form of the nuc
gene designated nucA, which is generated by proteolytic cleavage of the N-terminal
19 to 21 amino acids of the secreted propeptide designated SNase B (Shortle, (1983),
supra). Three sense primers (nucS1, nucS2 and nucS3, Appendix 1) were designed,
each one having a blunt-ended restriction endonuclease cleavage site for EcoRV or
SmaI in a different reading frame with respect to the nuc gene. Additionally BglII
and BamHI were incorporated at the 5' ends of the sense and anti-sense primers
respectively to facilitate cloning into BamHI and BglII cut pTREP1. The sequences
of all the primers are given in Appendix 1. Three nuc gene DNA fragments
encoding the mature form of the nuclease gene (NucA) were amplified by PCR using
each of the sense primers combined with the anti-sense primer described above. The
nuc gene fragments were amplified by PCR using S. aureus genomic DNA template,
Vent DNA Polymerase (NEB) and the conditions recommended by the
manufacturer. An initial denaturation step at 93°C for 2 min was followed by 30
cycles of denaturation at 93°C for 45 sec, annealing at 50°C for 45 sec, and
extension at 73°C for 1 min, and then a final 5 min extension step at 73°C. The
PCR amplified products were purified using a Wizard clean up column (Promega) to
remove unincorporated nucleotides and primers.
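That the three sense primers place the blunt cloning site in three different
reading frames relative to nuc can be checked with a short Python sketch. The
primer sequences are taken from Appendix 1 as transcribed (nucS2 is read with
its SmaI site as cccggg); the helper name and the convention of reporting the
frame as the number of nucleotides between the blunt cut and the first nuc
codon, modulo 3, are assumptions made for illustration.

```python
# Recognition site and position of the blunt cut within the site.
BLUNT_CUTTERS = {"EcoRV": ("gatatc", 3), "SmaI": ("cccggg", 3)}

# First codons of the amplified nuc sequence as they appear in the primers.
NUC_START = "tcacaaacagataacggc"

SENSE_PRIMERS = {
    "nucS1": ("cgagatctgatatctcacaaacagataacggcgtaaatag", "EcoRV"),
    "nucS2": ("gaagatcttccccgggatcacaaacagataacggcgtaaatag", "SmaI"),
    "nucS3": ("cgagatctgatatccatcacaaacagataacggcgtaaatag", "EcoRV"),
}


def frame_offset(primer, enzyme):
    """Nucleotides between the blunt cut and the first nuc codon, modulo 3."""
    site, cut = BLUNT_CUTTERS[enzyme]
    cut_position = primer.index(site) + cut
    nuc_position = primer.index(NUC_START, cut_position)
    return (nuc_position - cut_position) % 3


frames = {name: frame_offset(seq, enz) for name, (seq, enz) in SENSE_PRIMERS.items()}
print(frames)
```

Under these assumptions each primer yields a distinct offset, so a blunt-ended
insert ending in any frame can be fused in frame to nuc in one of the three
vectors.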

(c) Construction of the pTREP1-nuc vectors

The purified nuc gene fragments described in section b were digested with BglII and
BamHI using standard conditions and ligated to BamHI and BglII cut and
dephosphorylated pTREP1 to generate the pTREP1-nuc1, pTREP1-nuc2 and
pTREP1-nuc3 series of reporter vectors. General molecular biology techniques were
carried out using the reagents and buffer supplied by the manufacturer or using
standard conditions (Sambrook and Maniatis, (1989), supra). In each of the
pTREP1-nuc vectors the expression cassette comprises a transcription terminator,
lactococcal promoter P1, unique cloning sites (BglII, EcoRV or SmaI) followed by
the mature form of the nuc gene and a second transcription terminator. Note that the
sequences required for translation and secretion of the nuc gene were deliberately
excluded in this construction. Such elements can only be provided by appropriately
digested foreign DNA fragments (representing the target bacterium) which can be
cloned into the unique restriction sites present immediately upstream of the nuc gene.

In possessing a promoter, the pTREP1-nuc vectors differ from the pFUN vector
described by Poquet et al. (1998), supra, which was used to identify L. lactis
exported proteins by screening for Nuc activity directly in L. lactis. As the
pFUN vector does not contain a promoter upstream of the nuc open reading frame,
the cloned genomic DNA fragment must also provide the signals for transcription in
addition to those elements required for translation initiation and secretion of Nuc.
This limitation may prevent the isolation of genes that are distant from a promoter,
for example genes which are within polycistronic operons. Additionally there can be
no guarantee that promoters derived from other species of bacteria will be
recognised and functional in L. lactis. Certain promoters may be under stringent
regulation in the natural host but not in L. lactis. In contrast, the presence of the P1
promoter in the pTREP1-nuc series of vectors ensures that promoterless DNA
fragments (or DNA fragments containing promoter sequences not active in L. lactis)
will still be transcribed.

(d) Screening for secreted proteins in S. pneumoniae

Genomic DNA isolated from S. pneumoniae was digested with the restriction
enzyme Tru9I. This enzyme, which recognises the sequence 5'- TTAA -3', was used
because it cuts A/T rich genomes efficiently and can generate random genomic
DNA fragments within the preferred size range (usually averaging 0.5 - 1.0 kb).
This size range was preferred because there is an increased probability that the P1
promoter can be utilised to transcribe a novel gene sequence. However, the P1
promoter may not be necessary in all cases as it is possible that many Streptococcal
promoters are recognised in L. lactis. DNA fragments of different size ranges were
purified from partial Tru9I digests of S. pneumoniae genomic DNA. As the Tru9I
restriction enzyme generates staggered ends, the DNA fragments had to be made
blunt ended before ligation to the EcoRV or SmaI cut pTREP1-nuc vectors. This
was achieved by the partial fill-in enzyme reaction using the 5'-3' polymerase
activity of Klenow enzyme. Briefly, Tru9I digested DNA was dissolved in a solution
(usually between 10-20 µl in total) supplemented with T4 DNA ligase buffer (New
England Biolabs; NEB) (1X) and 33 µM of each of the required dNTPs, in this case
dATP and dTTP. Klenow enzyme was added (1 unit Klenow enzyme (NEB) per µg
of DNA) and the reaction incubated at 25°C for 15 minutes. The reaction was
stopped by incubating the mix at 75°C for 20 minutes. EcoRV or SmaI digested
pTREP-nuc plasmid DNA was then added (usually between 200-400 ng). The mix
was then supplemented with 400 units of T4 DNA ligase (NEB) and T4 DNA ligase
buffer (1X) and incubated overnight at 16°C. The ligation mix was precipitated
directly in 100% ethanol and 1/10 volume of 3M sodium acetate (pH 5.2) and used
to transform L. lactis MG1363 (Gasson, 1983). Alternatively, the gene cloning site
of the pTREP-nuc vectors also contains a BglII site which can be used to clone, for
example, Sau3AI digested genomic DNA fragments.
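The fragment sizes to be expected from a complete Tru9I digest can be explored
in silico. The following Python sketch cuts a sequence after the first T of
each TTAA site (T^TAA, leaving a 5'-TA overhang that is filled in with dATP and
dTTP). The function names and the toy sequence are hypothetical; a partial
digest, as used above, would leave some sites uncut and give larger fragments.

```python
def digest(sequence, site="TTAA", cut_offset=1):
    """Return the fragments produced by cutting at every occurrence of the site."""
    sequence = sequence.upper()
    cut_points = []
    position = sequence.find(site)
    while position != -1:
        cut_points.append(position + cut_offset)
        position = sequence.find(site, position + 1)
    bounds = [0] + cut_points + [len(sequence)]
    return [sequence[start:end] for start, end in zip(bounds, bounds[1:])]


def in_size_range(fragments, low=500, high=1000):
    """Keep fragments whose length falls within the preferred range (in bp)."""
    return [fragment for fragment in fragments if low <= len(fragment) <= high]


# Toy sequence for illustration only; real input would be genomic sequence.
toy_fragments = digest("GGTTAACCCCTTAAGG")
print([len(fragment) for fragment in toy_fragments])
```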

L. lactis transformant colonies were grown on brain heart infusion agar and nuclease
secreting (Nuc+) clones were detected by a toluidine blue-DNA-agar overlay (0.05
M Tris pH 9.0, 10 g of agar per litre, 10 g of NaCl per litre, 0.1 mM CaCl2, 0.03%
wt/vol. salmon sperm DNA and 90 mg of Toluidine blue O dye) essentially as
described by Shortle, 1983, supra and Le Loir et al., 1994, supra. The plates were
then incubated at 37°C for up to 2 hours. Nuclease secreting clones develop an
easily identifiable pink halo. Plasmid DNA was isolated from Nuc+ recombinant L.
lactis clones and DNA inserts were sequenced on one strand using the NucSeq
sequencing primer described in Appendix 1, which sequences directly through the
DNA insert.

Isolation of Genes Encoding Exported Proteins from S. pneumoniae

A large number of gene sequences putatively encoding exported proteins in S.
pneumoniae have been identified using the nuclease screening system. These have
now been further analysed to remove artefacts. The sequences identified using the
screening system have been analysed using a number of parameters.

1. All putative surface proteins were analysed for leader/signal peptide
sequences using the software programs Sequencher (Gene Codes Corporation) and
DNA Strider (Marck, Nucleic Acids Res., 16:1829-1836 (1988)). Bacterial signal
peptide sequences share a common design. They are characterised by a short
positively charged N-terminus (N region) immediately preceding a stretch of
hydrophobic residues (central portion, h-region) followed by a more polar C-terminal
portion which contains the cleavage site (c-region). Computer software is available
which allows hydropathy profiling of putative proteins and which can readily
identify the very distinctive hydrophobic portion (h-region) typical of leader peptide
sequences. In addition, the sequences were checked for the presence or absence of
a potential ribosomal binding site (Shine-Dalgarno motif) required for translation
initiation of the putative nuc reporter fusion protein.

2. All putative surface protein sequences were also matched with all of the
protein/DNA sequences in the publicly available databases [OWL-proteins inclusive
of SwissProt and GenBank translations]. This allows us to identify sequences similar
to known genes or homologues of genes for which some function has been ascribed.
Hence it has been possible to predict a function for some of the genes identified
using the LEEP system and to unequivocally establish that the system can be used to
identify and isolate gene sequences of surface associated proteins. We should also be
able to confirm that these proteins are indeed surface related and not artefacts. The
LEEP system has been used to identify novel gene targets for vaccine and therapy.

3. Some of the proteins identified did not possess a typical leader
peptide sequence and did not show homology with any DNA/protein sequences in
the database. Indeed these proteins may indicate the primary advantage of our
screening method, i.e. the isolation of atypical surface-related proteins, which may
have been missed in all previously described screening protocols or approaches
based on sequence homology searches.
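The hydropathy profiling referred to in item 1 can be sketched in Python using
the Kyte-Doolittle scale and a sliding window. This is a minimal illustration
and not the algorithm of the software packages named above; the window length
of 9 and the function names are assumptions.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=9):
    """Mean Kyte-Doolittle score of each window along the protein."""
    scores = [KYTE_DOOLITTLE[residue] for residue in protein.upper()]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


def most_hydrophobic_window(protein, window=9):
    """Start index and mean score of the most hydrophobic window."""
    profile = hydropathy_profile(protein, window)
    start = max(range(len(profile)), key=profile.__getitem__)
    return start, profile[start]


# Hypothetical leader-like sequence: charged N region, hydrophobic h-region,
# polar C-terminal portion.
print(most_hydrophobic_window("MKKKLLALLAVLLVGCSNQDK"))
```

A pronounced maximum near the N-terminus, preceded by positively charged
residues, is the pattern that would flag a candidate leader peptide for closer
inspection.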



In all cases, only partial gene sequences were initially obtained. Full length genes
were obtained in all cases by reference to the TIGR S. pneumoniae database
(www.tigr.org). Thus, by matching the originally obtained partial sequences with
the database, we were able to identify the full length gene sequences. In this way, as
described herein, three groups of genes were clearly identified, i.e. a group of genes
encoding previously unidentified S. pneumoniae proteins, a second group exhibiting
some homology with known proteins from a variety of sources and a third group
which encoded known S. pneumoniae proteins, which were, however, not known as
antigens.



Example 2: Vaccine trials

pcDNA3.1+ as a DNA vaccine vector

The vector chosen for use as a DNA vaccine vector was pcDNA3.1 (Invitrogen)
(actually pcDNA3.1+; the forward orientation was used in all cases, but it may be
referred to as pcDNA3.1 from here on). This vector has been widely and successfully
employed in the literature as a host vector to test vaccine candidate genes for
protection against pathogens (Zhang et al., Kurar and Splitter, Anderson et al.). The
vector was designed for high-level stable and non-replicative transient expression in
mammalian cells. pcDNA3.1 contains the ColE1 origin of replication which allows
convenient high-copy number replication and growth in E. coli. This in turn allows
rapid and efficient cloning and testing of many genes. The pcDNA3.1 vector has a
large number of cloning sites and also contains the gene encoding ampicillin
resistance to aid in cloning selection and the human cytomegalovirus (CMV)
immediate-early promoter/enhancer which permits efficient, high-level expression of
the recombinant protein. The CMV promoter is a strong viral promoter in a wide
range of cell types including both muscle and immune (antigen presenting) cells.
This is important for optimal immune response as it remains unknown which
cell types are most important in generating a protective response in vivo. A T7
promoter upstream of the multiple cloning site affords efficient expression of the
modified insert of interest and allows in vitro transcription of a cloned gene in
the sense orientation.

Zhang, D., Yang, X., Berry, J., Shen, C., McClarty, G. and Brunham, R.C. (1997)
"DNA vaccination with the major outer-membrane protein genes induces acquired
immunity to Chlamydia trachomatis (mouse pneumonitis) infection". Infection and
Immunity, 176, 1035-40.

Kurar, E. and Splitter, G.A. (1997) "Nucleic acid vaccination of Brucella abortus
ribosomal L7/L12 gene elicits immune response". Vaccine, 15, 1851-57.

Anderson, R., Gao, X.-M., Papakonstantinopoulou, A., Roberts, M. and Dougan,
G. (1996) "Immune response in mice following immunisation with DNA encoding
fragment C of tetanus toxin". Infection and Immunity, 64, 3168-3173.

Preparation of DNA vaccines

Oligonucleotide primers were designed for each individual gene of interest derived
using the LEEP system. Each gene was examined thoroughly, and where possible,
primers were designed such that they targeted that portion of the gene thought to
encode only the mature portion of the gene protein. It was hoped that expressing
those sequences that encode only the mature portion of a target gene protein would
facilitate its correct folding when expressed in mammalian cells. For example, in the
majority of cases primers were designed such that putative N-terminal signal peptide
sequences would not be included in the final amplification product to be cloned into
the pcDNA3.1 expression vector. The signal peptide directs the polypeptide
precursor to the cell membrane via the protein export pathway where it is normally
cleaved off by signal peptidase I (or signal peptidase II if a lipoprotein). Hence the
signal peptide does not make up any part of the mature protein whether it be
displayed on the surface of the bacteria or secreted. Where an N-terminal
leader peptide sequence was not immediately obvious, primers were designed to
target the whole of the gene sequence for cloning and, ultimately, expression in
pcDNA3.1.

Having said that, however, other additional features of proteins may also affect the
expression and presentation of a soluble protein. DNA sequences encoding such
features in the genes encoding the proteins of interest were excluded during the
design of oligonucleotides. These features included:

1. LPXTG cell wall anchoring motifs.

2. LXXC lipoprotein attachment sites.

3. Hydrophobic C-terminal domain.

4. Where no N-terminal signal peptide or LXXC was present the start codon was
excluded.

5. Where no hydrophobic C-terminal domain or LPXTG motif was present the stop
codon was removed.

Appropriate PCR primers were designed for each gene of interest and any and all of
the regions encoding the above features were removed from the gene when designing
these primers. The primers were designed with the appropriate enzyme restriction
site followed by a conserved Kozak nucleotide sequence (in most cases GCCACC
was used, except in occasional instances, for example ID59; the Kozak sequence
facilitates the recognition of initiator sequences by eukaryotic ribosomes) and an
ATG start codon upstream of the insert of the gene of interest. For example, for a
forward primer using a BamHI site the primer would begin
GCGGGATCCGCCACCATG followed by a small section of the 5' end of the gene
of interest. The reverse primer was designed to be compatible with the forward
primer and with a NotI restriction site at the 5' end in most cases (this site is
TTGCGGCCGC) (NB except in occasional instances, for example ID59, where an
XhoI site was used instead of NotI).
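The primer layout described above can be expressed as a small Python sketch.
The function names and default arguments are hypothetical; the BamHI site,
Kozak sequence, ATG codon and NotI tail are those given in the text, and the
gene-specific sections would in practice be chosen with regard to melting
temperature and the excluded features listed earlier.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(sequence):
    """Reverse complement of an upper-case DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence.upper()))


def forward_primer(gene_5prime, clamp="GCG", site="GGATCC", kozak="GCCACC"):
    """5' clamp + restriction site + Kozak sequence + ATG + start of the insert."""
    return clamp + site + kozak + "ATG" + gene_5prime.upper()


def reverse_primer(gene_3prime, tail="TTGCGGCCGC"):
    """NotI-containing tail followed by the reverse complement of the insert end."""
    return tail + reverse_complement(gene_3prime)


# Gene-specific section of the ID5 forward primer as listed in this document;
# that primer carries a single C ahead of the BamHI site.
print(forward_primer("GGTCTAATTGAAGACTTAAAAAATCAA", clamp="C"))
```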

PCR primers

The following PCR primers were designed and used to amplify the truncated genes
of interest.

ID5
Forward Primer 5' CGGATCCGCCACCATGGGTCTAATTGAAGACTTAAAAAATCAA 3'
Reverse Primer 5' TTGCGGCCGCCAATGCTAGACTAAACACAAGACTCA 3'

ID59
Forward Primer 5' CGCGGATCCATGAAAAAAATCTATTCATTTTTAGCA 3'
Reverse Primer 5' CCCTCGAGGGCTACTTCCGATACATTTTAAACTGTAGG 3'

ID51
Forward Primer 5' CGGATCCGCCACCATGAGTCATGTCGCTGCAAATG 3'
Reverse Primer 5' TTGCGGCCGCATACCAAACGCTGACATCTACG 3'

ID29
Forward Primer 5' CGGATCCGCCACCATGCAAAAAGAGCGGTATGGTTATG 3'
Reverse Primer 5' TTGCGGCCGCACCCCCATTCTTAATCCCTT 3'

ID50
Forward Primer 5' CGGATCCGCCACCATGGAGGTATGTGAAATGTCACGTAAA 3'
Reverse Primer 5' TTGCGGCCGCTTTTACAAAGTCAAGCAAAGCC 3'

Cloning

The insert along with the flanking features described above was amplified using PCR
against a template of genomic DNA isolated from type 4 S. pneumoniae strain 11886
obtained from the National Collection of Type Cultures. The PCR product was cut
with the appropriate restriction enzymes and cloned into the multiple cloning site of
pcDNA3.1 using conventional molecular biological techniques. Suitably mapped
clones of the genes of interest were cultured and the plasmids isolated on a large
scale (> 1.5 mg) using Plasmid Mega Kits (Qiagen). Successful cloning and
maintenance of genes was confirmed by restriction mapping and sequencing 700
base pairs through the 5' cloning junction of each large scale preparation of each
construct.

Strain validation

A strain of type 4 was used in cloning and challenge methods, which is the strain
from which the S. pneumoniae genome was sequenced. A freeze dried ampoule of a
homogeneous laboratory strain of type 4 S. pneumoniae strain NCTC 11886 was
obtained from the National Collection of Type Cultures. The ampoule was opened and
the culture resuspended with 0.5 ml of tryptic soy broth (0.5% glucose, 5%
blood). The suspension was subcultured into 10 ml tryptic soy broth (0.5% glucose,
5% blood) and incubated statically overnight at 37°C. This culture was streaked on
to 5% blood agar plates to check for contaminants and confirm viability and on to
blood agar slopes, and the rest of the culture was used to make 20% glycerol stocks.
The slopes were sent to the Public Health Laboratory Service where the type 4
serotype was confirmed.

A glycerol stock of NCTC 11886 was streaked on a 5% blood agar plate and
incubated overnight in a CO2 gas jar at 37°C. Fresh streaks were made and optochin
sensitivity was confirmed.

Pneumococcal challenge



A standard inoculum of type 4 S. pneumoniae was prepared and frozen down by
passaging a culture of pneumococcus 1x through mice, harvesting from the blood of
infected animals, and growing up to a predetermined viable count of around 10^
cfu/ml in broth before freezing down. The preparation is set out below as per the
flow chart.

Streak pneumococcal culture and confirm identity

V

Grow over-night culture from 4-5 colonies on plate above

V

Animal passage pneumococcal culture
(i.p. injection of cardiac bleed to harvest)

V

Grow over-night culture from animal passaged pneumococcus

V

Grow day culture (to pre-determined optical density) from over-night of animal
passage and freeze down at -lO^'C - This is the standard inoculum

V

Thaw one aliquot of standard inoculum to viable count

V

Use standard inoculum to determine effective dose (called Virulence Testing)

V

All subsequent challenges - use standard inoculum to effective dose

An aliquot of standard inoculum was diluted 500x in PBS and used to inoculate the
mice.

Mice were lightly anaesthetised using halothane and then a dose of 1.4 x 10^ cfu of
pneumococcus was applied to the nose of each mouse. The uptake was facilitated by
the normal breathing of the mouse, which was left to recover on its back.

S. pneumoniae vaccine trials

Vaccine trials in mice were carried out by the administration of DNA to 6 week old
CBA/ca mice (Harlan, UK). Mice to be vaccinated were divided into groups of six
and each group was immunised with recombinant pcDNA3.1+ plasmid DNA
containing a specific target-gene sequence of interest. A total of 100 µg of DNA in
Dulbecco's PBS (Sigma) was injected intramuscularly into the tibialis anterior
muscle of both legs (50 µl in each leg). A boost was carried out using the same
procedure 4 weeks later. For comparison, control groups were included in all
vaccine trials. These control groups were either unvaccinated animals or those
administered with non-recombinant pcDNA3.1+ DNA (sham vaccinated) only,
using the same time course described above. 3 weeks after the second immunisation,
all mice groups were challenged intra-nasally with a lethal dose of S. pneumoniae
serotype 4 (strain NCTC 11886). The number of bacteria administered was
monitored by plating serial dilutions of the inoculum on 5% blood agar plates. A
problem with intranasal inoculation is that in some mice the inoculum bubbles out
of the nostrils; this has been noted in the results table and taken account of in
calculations. A less obvious problem is that a certain amount of the inoculum for
each mouse may be swallowed. It is assumed that this amount will be the same for
each mouse and will average out over the course of inoculations. However, the
sample sizes that have been used are small and this problem may have significant
effects in some experiments. All mice remaining after the challenge were killed 3 or
4 days after infection. During the infection process, challenged mice were monitored
for the development of symptoms associated with the onset of S. pneumoniae
induced disease. Typical symptoms, in approximate order, included piloerection,
an increasingly hunched posture, discharge from eyes, increased lethargy and
reluctance to move. The latter symptoms usually coincided with the development of
a moribund state, at which stage the mice were culled to prevent further suffering.
These mice were deemed to be very close to death, and the time of culling was used
to determine a survival time for statistical analysis. Where mice were found dead,
the survival time was taken as the last time point when the mouse was monitored
alive.

Interpretation of Results

A positive result was taken as any DNA sequence that was cloned and used in
challenge experiments as described above which gave protection against that
challenge. Protection was taken as those DNA sequences that gave statistically
significant protection (to a 95% confidence level (p < 0.05)) and also those which
were marginal or close to significant using Mann-Whitney or which showed some
protective features, for example where there were one or more outlying mice or where
the time to the first death was prolonged. It is acceptable to allow marginal or non-
significant results to be considered as potential positives when it is considered that
the clarity of some of the results may be clouded by the problems associated with the
administration of intranasal infections.

p value 2 refers to significance tests compared to pcDNA3.1+ vaccinated controls.

Statistical Analyses.

Trial 1 - None of the other groups had significantly longer survival times than the
controls. The survival times of the unvaccinated and pcDNA3.1 control groups were
not significantly different. One of the mice from ID5 was an outlying result and the
mean survival times for ID5 were extended but not significantly so.
Trial 2 - The group vaccinated with ID59 had significantly longer survival times than
the unvaccinated control group.
Trial 5 - The group vaccinated with ID59 again survived for an average of almost 10
hours longer than the controls but the results were not quite statistically significant.
Trial 6 - The group vaccinated with ID51 did not have survival times significantly
higher than unvaccinated controls (p= <36.0); however, there were 2 outlying mice
in the vaccinated group.

Vaccine trials 7 and 8 (see figure 2)

Mean survival times (hours)

Mouse       Unvacc        ID29 (7)    Unvacc        ID50 (8)
number      control (7)               control (8)

1           59.6          73.1        45.1          60.6
2           47.2          54.8        50.8          60.6
3           59.6          59.3        60.4          51.1
4           70.9          54.8*       55.2          60.6
5           68.6*         59.3        45.1          60.6
6           76.0          54.8        45.1          60.6

Mean        63.6          59.35       50.2          59.1
sd          10.3          7.1         6.4           3.9
p value 1                 <39.0                     0.0048

* - bubbled when dosed so may not have received full inoculum.
T - terminated at end of experiment having no symptoms of infection.
Numbers in brackets - survival times disregarded assuming incomplete dosing.
p value 1 refers to significance tests compared to unvaccinated controls.
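The Mann-Whitney comparison used for these trials rests on the U statistic,
which can be computed directly from the survival times. The Python sketch below
uses the trial 8 values from the table above; the function name is
hypothetical, ties are counted as one half, and no p value is derived here
since the exact procedure used for the tabulated p values is not stated.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b): pairwise wins of each sample, ties counting 0.5."""
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Survival times (hours) for trial 8, as tabulated.
unvaccinated_8 = [45.1, 50.8, 60.4, 55.2, 45.1, 45.1]
id50_8 = [60.6, 60.6, 51.1, 60.6, 60.6, 60.6]

u_vaccinated, u_control = mann_whitney_u(id50_8, unvaccinated_8)
print(u_vaccinated, u_control)
```

The smaller of the two U values is the one conventionally compared against
critical values for the given group sizes.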

Statistical Analyses.

Trial 7 - The ID29 vaccinated group showed prolonged times to the first death.
Trial 8 - The group vaccinated with ID50 survived significantly longer than
unvaccinated controls.

Appendix 1 - Oligonucleotide primers



nucS1 (BglII, EcoRV)
5'- cgagatctgatatctcacaaacagataacggcgtaaatag -3'

nucS2 (BglII, SmaI)
5'- gaagatcttccccgggatcacaaacagataacggcgtaaatag -3'

nucS3 (BglII, EcoRV)
5'- cgagatctgatatccatcacaaacagataacggcgtaaatag -3'

nucR (BamHI)
5'- cgggatccttatggacctgaatcagcgttgtc -3'

NucSeq
5'- ggatgctttgtttcaggtgtatc -3'

pTREPF
5'- catgatatcggtacctcaagctcatatcattgtccggcaatggtgtgggctttttttgttttagcggataa
caatttcacac -3'

pTREPR
5'- gcggatcccccgggcttaattaatgtttaaacactagtcgaagatctcgcgaattctcctgtgtgaaatt
gttatccgcta -3'

pUCF
5'- cgccagggttttcccagtcacgac -3'

Vr
5'- tcaggggggcggagcctatg -3'

V1
5'- tcgtatgttgtgtggaattgtg -3'

V2
5'- tccggctcgtatgttgtgtggaattg -3'

TABLE 1

ID4 1200 bp


ATGAGAAATATG TGGGTT GTAATCAAGGAAACCrATCTTCGACATGTCGAGTCATGGA Gi UCllLi i l ATGGTGA 
TTTCGCCGTTCCTCTTTTTAGGAATCTCTGTAGGAATTGGGCATCTCCAAGGT^ 

AGTGGCAGTAGTGACAACAGTGCCATCTGTAGCAGAAGGACTGAAGAATGTAAATGGTGTTAACTTCGACTATAA 
AGACGAAGCAAGTGCCAAAGAAGCAATTAAAGAAGAAAAATTAAAAGGTTArrrGACCATTGATCAAGAAGATA 

lU GTGTTCTAAAGGCAGTTTATCATGGCGAAACATCGCTTGAAAATGGAATrAAATITGAGGTrACAGGTACACTCA 
ATGAACTGCAAAATCAGCTTAATCGTTCAACTGCTTCCTTGTCTCAAGAGCAGGAAAAACGC^ 
TTCA ATTCACAG AAAAGATTGATGAAGCCAAGGAAAATAAAAAGTTTATTCAAACAATTGCAGCAGGTGCCTTAG 
GATTCriUClUUAT ATGATTC TGATTACCTATGCGGGTGTAACAGCTCAGGAAGTTGCCAGTGAAAAAGGCACCAA 
A ATTA TGGAAGTCGTTTTTTO'AGCATAAGGGCAAGTCACTATTTCTATGCGCGGATGATGGCT 

15 ATTTTAACGCATATTGGGATCTATGTTGTAGGTGGTCTGGCTGCCGTTTTGCT 

TCAGTCTGGTATTTTGGATCACTTGGGAGATGCTATCrCACTGAATACCTrGCTCTTrATT^ 
TGTAC GTAG TCTTGGCAGCC TTCCTAGG ATCTATGGTTTCTCGTCCTGAGGACTCAGGGAAAGCC^ 
GATGATTTTGATTA TGGG TGGT 11 1'l I'l'GGAGTGACAGCrCTAGGTGCAGCTGGTGACAATCTCCTCTTGAAGATT 
GGTTCTTATATTCCCTTTATTrCGACCTTCTTTATGCCGTTTCGAACGATTAATG 

20 CATGGATTTCACTTGCTATTACAGTGATTTTTGCGGTGGTAGCAACAGGATTTATCGGACGCATOT^ 
CGTTCTTCAAACGGATGATTTAGGGATTTGGAAAACCTTTAAACGTGCCrrATCTTATAAAT^ 

MRNMWVVIKETYLRHVESWSFFFMVISPFLFLGISVGlGHLQGSSMAKNNKVAVVTTVPSVAEGLKrA^NGVNFD^^ 

EASAKEAIKEEKLKGYLTIDQEDSVIXAVVHGETSLENGIKFEVTGTLNELQNQLNRSTASJLSQEQEKJU^Q^^^ 

25 DEAKENKiCFlQTUAGALGFFLYMILITYAGVTAQEVASEKGTKIMEVVFSSIi^HYFYARMMALFLVILTmGr^ 

GLAAVLLFKDLPFLAQSGILDHLGDAISLOTLLHUSLFMYVVl^FLGSMVSRPEDSGKAI^PLMILIMGGFFGVTA 

AAGDNLLLKIGSYIPFISTFFMPFRTINDYAGGAEAWISLAITVIFAVVATGFIGRMYASLVLQTDDLGIWKTFKRALSYK 
Z 

ID5 1125 bp

CCTGGGAAAGTCTTGAAAATTATGATAGAATGGTGGAAGGAAAAATTCAGGAGAGTAGTAGTGACTCAAAATGTT 

GAAAGTCTTCTCGTATCCATrCTAATCAGTCC^TACAATGAAGAAAAATATCTGCCTGGTCTAATTGAAGACT^ 

AAAATCAAACCTATCCTAAAGAGGATATTGAAATTCTArrrATAAATGCTATGTCCACAGATGGGACCACAGCTA 

i5 TCATTCAGCAATTTATAAAGGAAGATACAGAGTTTAACTCAATTAGATrGTATAACAATCCT 

CTAGT GGTTTT AACCTGGGAGTTAAACATTCTGTAGGGGACCTTATrrrAAAAATTGATGCTCAT^ 

TGAGALll i i GTAATGAACAATGTGGCTATTATTCAACAAGGTGAATTTGTCTGTGGGGGGCCTAGACCGACGATT 

GTCGAAGGAAAAGGAAAATGGGCAGAGACCTTGCATCTrGTTGAGGAAAATATGTTTGGCAGTAGCATTGCCAAT 

TATCGAAATAGTTCTGAGGATAGATATGTTTCrrCTATTrrrCATGGAATGTATAAACGAGAGGT^ 

4U TTGGTTTAGTAAATGAGCAACTTGGCCGAACTGAAGATAATGATATTCATTATAGAATTCGAGAATATGGTTATAA 
AATCCGCTATAGCCCAAGTATTCTATCTTATCAGTATATTCGACCAACATTCAAGAAAATGCTGCATCAAAAGTAT 
TCAAATGGTTTGTGGAT TGGCT TGACAAGTCATCTTCAGTTTAAGTGTTTATCATTATrrCACT^ 
ATTTGTTTrGAGTCTTGTGm 

A TTTT C TACnT TTGTCA TTAC TCACTTTGCTGACTTTATTAAAAC^^ 
40 ATTTTATTTTCCATTCACTTTGCTrATGGCCTTGGGACGATTGTAGGTTTAAT^^ 

AGTACAAGAGAACAATAATTTATrrGGATAAAATAAGCCAAATAAATCAAAATATGCTATAA 

PGKVLKIMIEWWKEKFRRVVVTQNVESIXVSIVISAYNEEKYLPGLffiDLKNQTYPKEDIEILnNAMSTDGTTAIIQQFIK 
EDTEFNSIRLYNNPKKNQASGFNLGVKHSVGDLILKIDAHSKVTETFVMNNVAIIQQGEFVCGGPRPTIVEGKGKWAET 
5U LHLVEENMFGSSIANYRNSSEDRYVSSIFHGMYKREVFQKVGLVNEQLGRTEDNDIHYRIREYGYKIRYSPSILSYQYIRP 
TFKKMLHQKYSNGLWIGLTSHVQFKCLSLFHYVPCU^I^LVFSLALLPITFVFn'LLLGAYFLLl^LLTLLTLLKHKNGF 
LIVMPFILFSIHFAYGLGTIVGLIRGFKWKKEYKRTIIYLDKISQINQNMLZ 






IDU 696 bp 

ATGATGAAAGAACAAAATACGATAGAAATCGATGTATTTCAATTAGTTAAAAGCTTGTGG^^ 
ATTTTAATAGTGGCACTTGTGACAGGTGCGGGGGCTTTTGCATATAGCACTTTTATTGT^ 

GTACCACGCGAATTTACGTAGTGAATCGCAATCAAGGAGACAAGCCGGGGTTGACAAATCAGGATTTGCAGGCAG 

GAACTTATCTGGTAAAAGACTACCGTGAGATTATCCTTTCGCAGGATGTmGGAGGAAGTTGm 

ACTAGATTTGACGCCAAAAGGTTTGGCTAATAAAATTAAAGTGACAGTACCAGTTGATACCCGTATTGTCTCT^ 

TCAGTTAATGATCGAGTTCCTGAAGAGGCAAGCCGTATCGCTAACTCTTrGAGAGAAGTAGCTGCTCAAAAAATT 

ATCAGTATTACTCGTGTTTCTGA CGTGAC AACACTGGAGGAGGCAAGGCCGGCGATATCCCCGTCrrCGCCAAAT 

ATTAAACGCAATACACTAATTGGTTTTTTGGCAGGGGTGATTGGAACTAGTGTTATAGTTCTTC^^ 
GGATACTCGTGTGAAACGTCCGGaaGATATCGAAAATACATTGCAGATGACACTTTTGGGACTTGTGCCAAACTT
GGGTAAGTTGAAATAC 

MMKEQOTIEIDVFOLVKSLWKRKLMILIVALVTGAGAFAYSTHVKPE^TSTTRIYVVNRNQGDKPGLTNQDLQAGT^ 
VKDYREIIl^ODVl^EWSDUaDLTPKGLANKIKVTVPVDTRIVSISVNDRVPEEASRIANSLREVAAQKIISITRVSDVT
TLEEARPAISPSSPNIKRNTUGFLAGVIGTSVIVLHLELLDTRVKRPEDIENTLQMTLLGVVPNLGKLKZ 

ID19 555 bp

ATGGTAAAAGTAGCAGTTATATTAGCTC AGGG CriTGAAGAAATTGAAGCCTTGACAGTTGTAGATGTCT^

GAGCCA ATATC ACATGTGATATGGTTGGTTTTGAAGAGCAAGTAACGGGTTCGCATGCAATCCAAGTAAGAGCAG 
ATCATGTCTTTGATGGAGATrrATCAGACTATGATATGATTGTTCTTCCTGGAGGTATGCCTG 
CGTGATAATCAGACCTTGATTCAAGAATrGCAAAGCrrCGAGCAAGAAGGGAAGAAACTAGCAGCCATTTGTGCC 
GCACCAATTGCCCTCAATCAAGCAGAGATATTGAAAAATAAGCGATACACrTGTTATGACGGCGrrCAAGAGCAA 

atccttga tggt cactacgtcaaggaaacagtagtggtagatggtcagttgacaaccagtcggggtccttcaaca

GCCCTTGCCTTTGCCTACGAGTTGGTGGAGCAACTAGGAGGGGACGCAGAGAGTTTACGAACAGGAATGCTCTAT 
CGAGATGTCTTTGGTAAAAATCAGTAA 

mvkvavilaqgfeeiealtwdvlrranitcdmvgfeeqvtgshaiqvradhvfdgdlsdydmivlpggmpgsahlr 
dnqtuqelqsfeqegkklaaicaapialnoaeilknkrytcydgvqeqildghyvketvvvdgqlttsrgpstalafa
yelveqlggdaeslrtgmlyrjdvfgknqz 

ID27 306 bp 

gtgctagggatggtagaaccaaacctagaaagccttataaaagatctttacaatcatgctcgacatgatttgagt
gaagatttagttgctgctctcctagagactactaaaaaacrgcctactacaaatgagcaattgcaggcagttcgt^ 
tctcaggcctggtcaatcgtgaattgctcctaaatcccaaacatccagcacctgagttgctcaact^ 
tgtcaaaagagaagaagccaagtacagaggaactgcgacttctgcgcttatgtatgaggaactctrtaaaatgct 

TTGA 






mvgmvepnlesukdlynharhdlsedlvaallettkklpttneqlqavri^glvnrelllnpkhpapellnu^ri^ 
reeakyrgtatsalmyeelfkmlz 

ID29 945 bp 



TTGTrcrrAAAAAA GGAA AGAGAGGTAATCAGCATGCGTAAATGGACAAAAGGATTTCTCATCTTTGGTGTGGTG 
ACTACCGTTATCGGCrrrATCCTGCTTTTTGTAGGTATCCAATCTGACGGGAtTAAGAGCCT^ 
AGAACCTGTCTATGATAGCCGTACGGAAAAGCTAACCTTTGGCAAGGAAGTCGAAAACCTAGAAATTACTCTCCA 
CCAACACACGCTCACCATCACAGACTCTTTCGATGATCAAATCCACATTTCTTACCATCCATCTCTTTCTGC^ 
CATGATCTTATCACCAATCAGAACGATAGAACTCTGAGTCTCACTGATAAGAAACTGTCTGAAACTCCGTTTCTCT
CTTCTGGAATTGGTGGGATTCrrCATATCGCAAGTAGCTACTCTAGTCGTTTTGAAGAAGTTATTCT^ 
AAAAGGGAGAACTCTAAAAGGGATCAACATCTCAGCCAATCGCGGACAAACCACCATCATAAATGCTAGCCTTGA 
AAATGCGACCCTCAATA CAAAC AGCTATATCCTCCGAATTGAAGGAAGTCGTATCAAAAACAGTAAACTCACAAC 
GCCCAATATCGTTAATATCrn-GATACAGTTCTTACAGATAGTCAGCTAGAGTCAACAGAGAATCACTTCCACGCT 
GAAAATATCCAAGTCCATGGCAAGGTTGAACTGACTGCCAAAGATTATCTCAGAATCATCCTAGACCAGAAAGAA 
AGCCAACGAATTAACTGGGACATCTCAAGCAACTATGGTrCTATCTTCCAATTCACAAGAGAAAAGCCTGAATCA 
AGAGGTACGGAATTAAGCAACCCTTACAAAACTGAAAAAACCGATGTCAAGGATCAACTCATTGCGAGATCTGAT 
GATAATATTGATCTAATATCCACACCAAGCAGACGTTGA 

MFLKKEREVISMRKWTKGFLIFGVVTTVIGFILLFVGIQSDGIKSLl^MSKEPVYDSRTEKLTFGICEVENLErrLHQ

tdsfddqihisyhpslsahhdlitnqndrtlsltdkklsetpflssgiggilhiassyssrfeevilrlpkgrtlkginisanr 
gqttiinaslenatl^^^nsyilriegsrik:nsklttpnivnifdtvltdsqlestenhfhaeniqvhgkveltakdylriild 
qkesqrinwdissnygsifqftrekpesrgtelsnpyktektdvkdqliarsddnidlistpsrrz 

ID30 879 bp

ATGAAACAAGAATGGTTTGAAAGTAATGArrTTGTAAAAACAACAAGCAAGAACAAGCCTGAAGAGCAAGCTCA 
AGAGGTTGCACACAAGGCTGAAGAAACGATAGCCGATCTCGATACACCAATTGAAAAAAATACTCAGTTAGAGG 
AGGAAGTCCCTCAAGCTGAAGTCGAATTGGAAAGCCAGCAAGAAGAGAAAATTGAAGCTCCTGAAGACAGTGAA 

gcgagaacagaaatagaagaaaagaaggcatctaattctactgaagaagagccagacctttctaaagaaacaga 

AAAAGTCACTATAGCTGAAGAGAGCCAAGAAGCTCrrCCrCAGCAAAAAGCAACCACGAAAGAGCCACrrC^ 
TGATTTTTGGTCTTGGCTAGTGGAAG^^ 

ACAGCCTTTCTCTTGCTCATTCrGTTrrCTGCATCTTCC I 1 1 Ti C r rrAGTATCTATCACATCAAACATGCTTACTAT 
OD GGACATATAGCAAGCATTAACAGTCGCrrrcCCTGAGCAGCTAGCTCCTTTAACT Cl 1 1 li I CTATCATCTCTATCCT 
AGTAGCGACAACACrcnei iClTl 1 CATrCCTCTTGGGTAGTTTCGTTGTGAGACGATTTATCCACCAGGAAAAG
GACTGGACGCTAGACAAGGTTCTCCAACAATATAGTCAACTCTTGGCAATTCCAATCTCCTCACTGCTATTGCTAG 
' IT l'CI' I TG Cl 1 ILl i I GATAGCCTACGATTTACAGCCCTCTTGTgTGTGA 

MKQEWFESNDFVKTTSKNKPEEQAQEVADKAEETIADLDTPIEKNTQLEEEVPQAEVELESQQEEKIEAPEDSEARTEIE
EKKASNSTEEEPDLSKETEKVTIAEESQEALPQQKATTKEPLLISKSLESPyiPDQAPKSRDKWKEQVLDFWSWLV 
PTSKL£TS^^HSYTAFIX^JLFSASSFFFSIYHIKHAyYGHlASINSRFP£Q^J^PLTLFS^SILVAT^-FF 
QEKDWTLDKVLQQYSQLLAIPISSLLLLVSLLSLIAYDLQPSCV2 

ID105 990 bp

ATGCAACTCGCTTOTCGGTCTACTCATTGTTCGTCTGGTACAATTTGTTCTTAAAAAAGGA>^ 

GCATGCGTAAATGGACAAAAGGArrrCTCATCTTTGGTGTGGTGACTACCGTTATCGGCTTTATCCTG Cil - r 

GGTATCCAATCTGACGGGATTAAGAGCCTACTTTCCATGTCCAAAGAACCTGTCTATGATAGCCGTACGGAAAAG 

CTAACCTTTGGCAAGGAAGTCGAAAACCTAGAAATTACTCTCCACCAACACACGCTCACCATCACAGACTCTTTC
GATGATCAAATCCACArn'CTTACCATCCATCrCTTTCTGCTCACCATGATCTTATCACCAATCAGAACGATAGAA 
CTCTGAGTCTCACTGATA AGAA ACTGTCTGAAACrCCGTTTCTCTCTTCTGGAATTGGTGGGATTCTTCAT 
AAGTAGCTACTCTAGTCGTTTTGAAGAAGTTATrCTCCGACTACCAAAAGGGAGAACTCTAAAACGGATCAA 
CTCAGCCAATCGCGGACAAACCACCATCATAAATGCTAGCCTTGAAAATGCGACCCrCAATACAAACAGCTATAT 

CCTCCGAATTGAAGGAAGTCGTATCAAAAACAGTAAACTCACAACGCCCAATATCGrrAATATCTTTGATACAGTT
CTTACAGATAGTCAGCTAGAGTCAACAGAGAATCACTTCCACGCTGAAAATATCCAAGTCCATGGCAAGGTrGAA 
CTGACTGCCAAAGA7TATCTCAGAATCATCCTAGACCAGAAAGAAAGCCAACGAATTAACTGGGACATCTCAAGC 
AACTATGGTTCTATCTTCCAATTCACAAGAGAAAAGCCTGAATCAAGAGGTACGGAATTAAGCAACCCrTACAAA 
ACTGAAAAAACCGATGTCAAGGATCAACTCATTGCGAGATCTGATGATAATATTGATCTAATATCCACACCAAGC 

AGACGTTGA

MQLASSVYSLFVWNLFLKKEREVISMRK\m:GFLIFGVVTTVIGFILLF^GIQSDGlKSLLSM 

KEVENl^rrLHQHTLmDSFDDQIHISYHPSLSAHHDLrrNQNDRTl^LTDKKLSETPFLSSGlGGILHlASSYSSRFEEV^ 
RLPKGRTLKGINISANRGQTTnNASLENATLhTTNSYILRIEGSRIKNSKLTTPNIVNIFDTVLTDSQLESTENHFHAENI^^ 
HGKVELTAKDYLRIILDQKESQRINWDISSNYGSIFQFTREKPESRGTELSNPYKTEKTDVKDQLIARSDDNIDUSTPS^
Z 

ID107 78 bp


ATGATATGTAAAATGAAGCAGGGAGGGAGCAGGGCGTGCTGGGGATGGAGAGTGGGGGAGGGACGCTGCTATTT 
TAATC 

MICKMKQGGSRACWGWRVGEGRCYFN 
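
Each nucleotide entry in these tables is followed by its conceptual translation. As a minimal sketch of how that pairing can be checked, the Python below translates the short 78 bp entry directly above in reading frame 1 and compares the result with the amino-acid line printed beneath it. It assumes the standard genetic code; the names `translate`, `listed_dna` and `listed_protein` are illustrative and do not come from the specification.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions (assumption: standard code applies).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate frame 1 of a DNA string.

    Whitespace is ignored, case is folded, trailing bases that do not
    fill a codon are dropped, and codons containing characters other
    than A, C, G, T are reported as 'X'.
    """
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(
        CODON_TABLE.get(seq[pos:pos + 3], "X") for pos in range(0, usable, 3)
    )


# Nucleotide and amino-acid lines exactly as printed in the entry above
# (the two nucleotide lines are joined).
listed_dna = (
    "ATGATATGTAAAATGAAGCAGGGAGGGAGCAGGGCGTGCTGGGGATGGAGAGTGGGGGAGGGACGCTGCTATTT"
    "TAATC"
)
listed_protein = "MICKMKQGGSRACWGWRVGEGRCYFN"

print(translate(listed_dna) == listed_protein)
```

In this sketch a stop codon is written as `*`, whereas most amino-acid lines in these tables appear to end with `Z` as the terminator symbol. Entries whose nucleotide lines contain scanning damage would show `X` at the affected codons rather than a residue.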
ID109 714 bp 

CGATAAAGAGGCnTGAGTAATCTCAATTTGCAGATTGAAAATGGAGAGATTATGGGCTTGATTGGTCATAATGG 
GGCTGGAAAATCGACCACTATAAAATCCCTAGTCAGTATCATTTCACCCAGCAGTGGTCGTATTTTGGTAGACGGT
CAGG AGTTATCGG AAAATCGCrrGGCT ATTAAACG AAAG ATTGGCTACGTAGCAG ACTCGCCTGACTTA'l '1 i-j i AC 
GCTTAACGGCCAATCAATriTGGGAATTGATCGCCTCATCCTATGATCrGAGTAGATCrGACT^ 
AGCTAGGCTATTGAACGTrTTTGATTTTGCTGAAAATCGCTATCAGGTTATTGAAACTCT^ 
CAGAAAGTCTTTGTCATCGGAGCACTCTTGTCTGATCCCGATATTTGGGTTITGGACGAACCCTTGACTGC^^ 
ATCCCCAGGCTGCCmGATTTGAAACAGATGATGAAGGAACATGCACAAAAAGGGAAGACAGTCTTCTTTTCAA
CTCATGTCCTAGAGGTGGCAGAGCAAGTCTGTGATCGGATTGCCATTTTGAAAAAGGGGCATTTGATTTATTGTGG 
TAAGGTAGAGGACTTGAGGAAAGACCACCCAGACCAGTCTTTGGAAAGTATCTACCTTAGTCrrGCTGGTAGAAA 
AGAGGAGGTTGCGGATGCGTCTCAAGGTCATTAA 

DKEALSNLNLQIENGEIMGLIGHNGAGKSTTIKSLVSIISPSSGRILVDGQELSENRLAIKRKIGYVADSPDLFLRLTANEF
WEUASSYDLSRSDLHASLARLLNVFDFAENRYQVIETLSHGMRQKVFVIGALLSDPDIWVLDEPLTGLDPQAAFDLICQ 
MMKEHAQKGKTVLFSTHVLEVAEQVCDRIAILKKGHLIYCGKVEDLRKDHPDQSLESIYLSLAGRKEEVADASQGHZ 

lD\n 360 bp 


ATGGCTTTGTTTTCAGAGAGAGGAGCAGTACGGAAGACACCAATGGCAAGTCCAATAATGAGACCTATGATGGTT 
CCGACGATAGAGATTAAAAGAGTGATACCAGCACCACGCAAGAGTTGTrGCCAGTTTTCAGAAAGAATTrTAGCA 
ACTTGGCTAAAGAAACTA CTGCT AGTCTCnTCAGTTGTTGTAGCTTCGGCAGGTTGTTCCT^ 
TCAAGGCAACTTGGTCATCTTTTGAAATGGTTTCAATGCTGGCATTGATTTGGCTAATACGATTGTCATr^ 
AGCCCGATAGCGATAGCrGTATCTTCTTCCCCAGTrrTGAAACCAGGTTCTACTTGA
MALFSERGAVRKTPMASPlMRPMMVPTIEIKRVIPAPIOCSCCQFSERlLATWLKKLLLVSSVVVASAGCSUmSIKATWSS
FEMVSMLALIWLIRLSFLRSPlAiAVSSSPVLKPGSTZ 

5 ID 128 - 3.43 

ATGAAATTTAGTAAAAAATATATAGCAGCTGGATCAGCTGTTATCGTATC 

CTTGAGTCTATCTGCCTATGCACTAAACCAGCATCGTTCGCAGGAAAATA 

AGGACAATAATCGTGTCTCTTATGTGGATGGCAGCCAGTCAAGTCAGAAA 
1 0 AGTGAAAACTTGACACCAGACCAGGTTAGCCAGAAAG AAGG AATTC AGGC 

TGAGCAAATTGTAATCAAAATTACAGATCAGGGCTATGTAACGTCACACG 

GTGACCACTATCATrACTATAATGGGAAAGTTCCTTATGATGCCCTCrTT 

AGTGAAGAACTCTTGATGAAGGATCCAAACTATCAACTTAAAGACGCTGA 

TATTGTCAATGAAGTCAAGGGTGGTTATATCATCAAGGTCGATGGAAAAT 
1 5 ATrATGTCTACCTGAAAGATGCAGCTCATGCTG ATAATGTTCG AACTA AA 

GATGAAATCAATCGTCAAAAACAAGAACATGTCAAAGATAATGAGAAGGT 

TAACTCTAATGTTGCTGTAGCAAGGTCTCAGGGACGATATACGACAAATG 

ATGGTTATGTCTTTAATCCAGCTGATATTATCGAAGATACGGGTAATGCT 

TATATCGTTCCrCATGGAGGTCACTATCACTACATTCCCAAAAGCGATTT 
20 ATCTGCTAGTGAATTAGCAGCAGCTAAAGCACATCTGGCTGGAAAAAATA 

TGCAACCGAGTCAGTTAAGCTATTCTTCAACAGCTAGTGACAATAACACG 

CAATCTCTAGCAAAAGGATCAACTAGCAAGCCAGCAAATAAATCTGAAAA 

TCrCCAGAGTCTTTTGAAGGAACTCTATGATTCACCTAGCGCCCAACGTT 

ACAGTGAATCAGATGGCCTGGTCTTTGACCCTGCTAAGATTATCACTCGT 
25 ACACCAAATGGAGTTGCGATTCCGCATGGCGACCATTACCACTTTATTCC 

TTACAGCAAGCTTTCTGCCTTAGAAGAAAAGATTGCCAGAATGGTGCCTA 

TCAGTGGAACTGGrrCTACAGrrrCTACAAATGCAAAACCTAATGAAGTA 

GTGTCTAGTCTAGGCAGTCrrTCAAGCAATCCTTCTTCTTTAACGACAAG 

TAAGGAGCTCTCTTCAGCATCTGATGGTTATATrnTAATCCAAAAGATA 
30 TCGrrGAAGAAACGGCTACAGCrrATATTGTAAGACATGGTGATCATTTC 

CATTACATTCCAAAATCAAATCAAATTGGGCAACCGACTCTTCCAAACAA 

TAGTCTAGCAACACCTTCTCCATCTCTTCCAATCAATCCAGGAACTTCAC 

ATGAGAAACATGAAGAAGATGGATACGGATTTGATGCTAATCGTATTATC 

GCTGAAGATGAATCAGGmTGTCATGAGTCACGGAGACCACAATCATTA 
35 TTTCTTCAAGAAGGACTTGACAGAAGAGCAAATTAAGGTGCGCAAAAACA 

TTTAG 

MKFSKKYIAAGSAVrVSLSLCAYALNQHRSQENKDNNRVSYVDGSQSSQK 

SENLTPDQVSQKEGIQAEQIVIKITDQGyVTSHGDHYHYYNGKVPYDALF 
40 SEELLMKDPNYQLKDAOrVNEVKGGYIIKVDGKYYVYLKDAAHADNVRTK 

DEINROKQEHVKDNEKVNSE^AVARSQGRYTTNDGYVFNPADIIEDTGNA 

YIVPHGGHYHYIPKSDLSASELAAAKAHLAGKNMQPSQLSYSSTASDNNT 

QSVAKGSTSKPANKSENLQSLUCELYDSPSAQRYSESDGLVFDPAKIISR 

TPNGVAlPHGDHYHFIPYSKLSALEEKlARMVPISGTGSTVSTNAfCPNEV 
45 VSSLGSLSSNPSSLTTSKELSSASDGYIFNPKDIVEETATAYIVRHGDHF 

HYIPKSNQIGQPTLPNNSLATPSPSLPINPGTSHEKHEEDGYGFDANRII 

AEDESGFVMSHGDHNHYFFKKDLTEEQIKVRKNI* 
TABLE 2
ID2 840 bp 

ATGGGAATTGCTCTAGAAAATGTGAATTTTACATATCAAGAAGGTACTCCCTTAGCTTCAGCAGC^
TTTCTTTGACGATTGAAGATGGCTCTrATACAGCTTTAATTGGGCACACAGGTAGTGGTAAATC^ 
ACTCTTAAATGGTTTATTGGTGCCAAGTCAAGGGAGTGTGAGGGTTTTTGATACOT 

AAT AAAG ATATTCGTCAA ATTAG AAAACAGGTTGGCTTGGTATTTCAGTTTGCTGAAAATCAGA l 1 1 ' i ' i 'GAAGAAA 
CGGTTTTGAAGGACGTTGCTTTTGGACCGCAAAATmCGAGTTTCTGA 

10 G AAACTGGCTCTGGTTGG AATTGATG AATCAC i"! ' i 1 ' 1 GATCCTAGTCCGTTTG AGCTGTCAGGGGG ACA AATG AG A 

CGTGTTGCCATTGCAGGCATACTTGCCATGGAGCCAGCTATATTAGTCrrAGATGAGCCAACAGCTGGTCTAGATC 
CTCTAGGGAGAAAAGAGTTGATGACCCTGTTCAAAAAACTCCACCAGTCAGGGATGACCATCGTCTTGGTAACGC 
ATTTGATGGATGA TGTTG CTGAATATGCGAATCAAGTCTATGTAATGGAAAAGGGACGTTTAGTAAAGGGGGGCA 
AACCAAGTGATGTCTTTCAAGACGTTGTrmATGGAAGAAGTTCAGTTGGGAGTACCTAAAATTACGGCCTT^ 

TAAACGATTGGCTGATAGAGGCGTGTCATTTAAACGATTACCGATTAAGATAGAGGAGTTCAAGGACTCGCTAAA
TGGATAG 

MGlALENVNFTYQEGTPLASAALSDVSLTIEDGSYTALIGHTGSGKSTILQLLNGIXVPSQGSVRVFDTLrrSTSKNKDIR 
QIRKQVGLVFQFAENQIFEETVLKDVAFGPQNFGVSEEDAVKTAREKLALVGIDESLFDRSPFELSGGQMRRVAIAGILA 
MEPAILVLDEPTAGLDPLGRKELMTLFKKLHOSGMTlVLVTHLMDDVAEyANQVyVMHKGRLVKGGKPSDVFQDW
FMEEVQLGVPICrrAFCKRLADRGVSFKRLPIKlHEFKESLNGZ 

ID3 6360 bp

tacccggtagtcttagcagacacatctagctctgaagatgcrrraaacatctctgataaagaaaaagtagcag)^
aataaagagaaacatgaaaatatccatagtgctatggaaacttcacaggattttaaagagaagaaaacagcagtc 
attaaggaaaaagaagttgttagtaaaaatcctgtgatagacaataacactagcaatgaagaagcaaaaatcaa 
agaagaaaattccaataaatcccaaggagatratacggacrcatttgtgaataaaaacacagaaaatcccaa 
agaagataaagttgtctatattgctgaatttaaagataaagaatctggagaaaaagcaatcaaggaactatccag 

30 tcttaagaatacaaaagttttatatacttatgatagaatttttaacggtagtgccatagaa^ 

ttggacaaaattaaacaaatagaaggtatrtcatcggttgaaagggcacaaaaagtccaacccatgatgaatcat 
gccagaaaggaaattggagttgaggaagctattgattacctaaagtctatcaatgctccgtttgggaaaaat^ 
gatggtagaggtatggtcatttcaaatatccatactggaacagattatagacataaggctatgagaatcgatgat 
gatgccaaagcctcaatgagatttaaaaaagaagacrraaaaggcactgataaaaattatrggttgagtgataaa 

35 atccctcatgcgttcaattattataatggtggcaaaatcactgtagaaaaatatgatgatggaagggatratntg 
acccacatgggatgcatattgcagggattcttgctggaaatgatactgaacaagacatcaaaaactttaacggca 
tagatggaattgcacctaatgcacaaattrrctcttacaaaatgtattctgacgcaggatctgggtttgcgggtga 
tgaaacaatgttrcatgctattgaagattctatcaaacacaacgttgatgttgtttccgtatcatctggtmaca 
ggaacaggtcttgtaggtgagaaatattcgcaagctattcgggcattaagaaaagcaggcattccaatggttgtc 

40 gctacgggtaactatgcgacttctgcttcaagttcttcatgggatttagtagcaaataatcatctgaaa^ 

acactggaaatgtaacacgaactgcagcacatgaagatgcgatagcggtcgcrrctgctaaaaatcaaacagttg 
agrrrgataaagttaacataggtggagaaagttrtaaatacagaaatataggggcctttttcgataagactaaaa 
tcacaacaaatgaagatggaacaaaagctcctagtaaattaaaatttgtatatataggcaaggggcaagaccaag 
atttgataggtttggatcttaggggcaaaattgcagtaatggatagaatttatacaaaggatttaaaaaatgct^ 

4:? taaaaaagcratggataagggtgcacgcgccattatggttgtaaatactgtaaattactacaatagagataattg 
gacagagcrrccagctatgggatatgaagcggatgaaggtactaaaagtcaagtgttttcaatttcaggagatga 
tggtgtaaagctatggaacatgattaatcctgataaaaaaactgaagtcaaaagaaataataaagaagatrrra^ 
agataaatt ggag caatactatccaattgatatggaaagttttaattccaacaaaccgaatgtaggtgacgaaaa 

AGAGATTGACTTTAAGTTTGCACCTGACACAGACAAAGAACTCTATAAAGAAGATATCATCGTTCCAGCAGGATC 
DU TACATCTTGGGGGCCAAGAATAGATTTACTTTTAAAACCCGATGTTTCAGCACCTGGTAAAAATATT^^ 

CTTAAT GTTA TTAATGGCAAATCAACTTATGGCTATATGTCAGGAACTAGTATGGCGACrCCAATCGTGGCAGCTT 
CTACTGTTTTGArrAGACCGAAATTAAAGGAAATGaTGAAAGACCTGTATTGAAAAATCTTA^^ 
AAATAGATCTTACAAG TCrrA CAAAAATTGCCCTACAAAATACTGCGCGACCTATGATGGATGCAACrrCTTGGA 
AAGAAAAAAGTCAA TACTT TGCATCACCTAGACAACAGGGAGCAGGCCTAATrAATGTGGCCAATGCTTTGAGAA 
ATGAAGTTGTAGCAACTTTCAAAAACACTGATTCrAAAGGTTrGGTAAACTCATATGGTO 
AATAAAAGGTGATAAAAAATACTTTACAATCAAGCTTCACAATACATCAAACAGACCmGACrmAAAG^ 
GCATCAGCGATAACTACAGATTCTCTAACTGACAGATTAAAACTTGATGAAACATATAAAGATGAAAAATCTCCA 
GATGGTAAGCAAATTGTTCCAG AAATT CACCCAGAAAAAGTCAAAGGAGCAAATATCACATTTGAGCATGATACT 
TTCACTATAGGCGCAAATTCTAGCTTTGATTTGAATGCGGTTATAAATGTTGGAGAGGC 
OU TTrGTAGAATCATTTATrCATTTTGAGTCAGTGGAAGCGATGGAAGCTCTAAACTCCAGCGGGAAGAAAATAAA 
TTCCAACCri'Cl-rrGTCGATGCCTCTAATGGGATTTGCTGGGAATTGGAACCACGAACCAATCCTTGATAAATGGG 
CTTGGGAAGAAGGGTCAAGATCAAAAACACTGGGAGGTTATGATGATGATGGTAAACCGAAAATTCCAGGAACCT 
TAAATAAGGGAATTGGTGGAGAACATGGTATAGATAAATTTAATCCAGCAGGAGTTATACAAAATAGAAAAGATA 
AAAATACAACATCCarGGATCAAAATCCAGAATrATTrGCTTTCAATAACGAAGGGATCAACGCTCCATCATCAA 
D3 GTGGTTCTAAGATTGCTAACATrrATCCnTrAGATTCAAATGGAAATCCTCAAGATGCTCAACT^ 
AACACCTTCTCCACTrGTATTAAGAAGrGCAGAAGAAGGArrGATTTCAA TAGTA AATACAAATAAAGAGGGAGA
AAATCAAAGAGACTTAAAAGTCATTTCGAGAGAACACTTTATTAGAGGAATmAAATTCTAAAAGCAATGAT^ 
AAAGGGAATCAAATCATCTAAACTAAAAGTTTGGGGTGACTTGAAGTGGGATGGACTCATCTATAATCCTAGAGG 
TAGAGAAGAAAATGCACCAGAAAGTAAGGATAATCAAGATCCTGCTACTAAGATAAGAGGTCAATTTGAACCGAT 
TGCGGAAGGTCAATATrrCTATAAArrrAAATATAGATTAACrAAAGATTACCCATGGCAGGrrrCCTATArrCCT

gtaaaaattgataacaccgccccraagattgtttcggttgatttrrcaaatcctgaaaaaattaagto 
aggatacttatcataaggtaaaagatcagtataagaatgaaacgctatttgcgagagatcaaaaagaacatcctg 
aaaaatttgacgagattgcgaacgaagrrtggtatgctggcgccgctcrrtgtraatgaagatggagaggt^ 
aaaatcrrgaagtaacrtacgcaggtgagggtcaaggaagaaatagaaaacttgataaagacggaaataccattt 

10 atgaaattaaaggtgcgggagamaaggggaaaaatcatrgaagtcamcattagatggttctagcaatttca 
caaagattcatagaattaaatttgctaatcagcctgatgaaaaggggatgattrcctattatctagtagatcctca 
tcaagattcatctaaatatcaaaagcttggcgagattgcagaatctaaatttaaaaatttaggaaatggaaaaga 
ggg tagtc taaaaaaagatacaactggggtagaacatca tcat caagaaaatgaagagtctattaaagaaaaat 
ctagttttactattgatagaaatxmcaacaattagagacmgaaaata^ 

15 gaaarrragagaagttgatgattttacaagtgaaactggtaagagaatggaggaatacgattataaatacgatga 
taaaggaaatataatagcctacgatgatgggactgatctagaatatgaaactgagaaacrrgacgaaatcaaatc 
aaaaatttatggtgttctaagtccgtctaaagatggacactttgaaattcttggaaagataagtaatgm 
aatgccaaggtatattatgggaataactataaatctatagaaatcaaagcgaccaagtatgatttccactcaaaa 
acgatgacatttgatctatacgctaatattaatgatattgtggatggattagcttttgcaggagatatgagattat 

20 ttgttaaagataatgatcagaaaaaagctgaaattaaaatragaatgcctgaaaaaattaaggaaactaaatcag 
aatatccctatgtatcaagttatgggaatgtcatagaattaggggaaggagatctttcaaaaaacaaaccagaca 

ATTTAACTAAAATGGAATCTGGTAAAATCTATTCTGATTCAGAAAAACAACAATATCTGTTAAAGGATAATATCAT 

tctaagaaaaggctatccactaaaagtgactacctataatcctggaaaaacggatatgttagaaggaaatggagt 
ctatagcaaggaagatatagcaaaaatacaaaaggccaatcctaatctaagagcccrrrcagaaacaacaattta 

25 tgctgatagtagaaatgttgaagatggaagaagtacccaatctgtattaatgtcggctttggacggcritaacatt 
ataaggtatcaagtgtttacatttaaaatgaacgataaaggggaagcratcgataaagacggaaatcttgtgaca 
gattcttctaaacrtgtattatttggtaaggatgataaagaatacactggagaggataagttcaatgtagaagcta 
taaaagaagatggctccatgrratttattgataccaaaccagtaaaccrrtcaatggataagaactaa^ 
atctaaatctaataaaattratgtacgaaatccagaattttatttaagaggtaagatrrct^ 

30 aactgggaattgagagttaatgaatcggttgtagataattatrraatctacgqagatttacacattgataacact^ 
gagattttaatattaagctgaatgttaaagacggtgacatcatggactggggaatgaaagactataaagcaaac^ 
gatn-ccagataaggtaa cagata tggatggaaatgtttatmcaaactggcratagcgatttgaatgctaaagc 
agttggagtccactatcagtttttatatgataatgttaaacccgaagtaaacattgatccraagggaaatactagt 
atcgaatatgctgatggaaaatctgtagtctttaacatcaatgataaaagaaataatggattcgatggtgagatt 

35 caagaacaacatatttatataaatggaaaagaatatacatcatttaatgatattaaacaaataatagacaagaca 
ctaaacattaagatrgrrgtaaaagatttrgcaagaaatacaaccgtaaaagaattcatttraaata^ 
ggagaggtaagtg aatta aaacctcatagggtaactgtgaccattcaaaatggaaaagaaatgagttcaacgata 
gtgtcggaagaagattttattttacctgtttataagggtgaattagaaaaaggataccaam 
tttctggtn-cgaaggtaaaaaagacgctggctatgttattaatctatcaaaagataccttrataaaacct 

40 caagaaaatagaggagaaaaaggaggaagaaaataaacctacttttgatgtatcgaaaaagaaagataacccac 
aagtaaaccatagtcaattaaatgaaagtcacagaaaagaggarrtacaaagagaagagcattcacaaaaatct 
gattcaactaaggatgttacagctacagttcttgataaaaacaatatcagtagtaaatcaactactaacaatcct 
aataagttgccaaaaactggaacagcaagcggagcccagacactattagctgccggaataatgtttatagtagga 

Al'Vm CVl GGATTGAAGAAAAAAAATCAAGATTAA 


YPVVU^DTSSSEDALNISDKEKVAE^OCEiCHENlH$AMETSQDFKEKKTAVIKEKEVVSKNPVIDNNTSNEEAKIKEENSN 
KSQGDYTDSI^^flC^aENPKKEDKVVYlAEFKDKESGEKAIKELSSLK^r^KVLYTYDiUFNGSAIETTPDNLD^:IKQIEGIS 
SVERAQKVQPMMNHARKEIGVEEAIDYLKSINAPFGKNFIXjRGMVlSNIDTGTDYRHKAMRlDDDAKASMRHCKEDL 
KGTDKNYWLSDlCIPHAFNYYNGGKnrVEKYDDGRDYFDPHGMHlAGILAGNDTEQDIKNFNGIDGIAPNAQIFSYKMY 

50 SDAGSGFAGDETMFHAJEDSIKHNVDVVSVSSGFTGTGLVGEKYWQAIRALRKAGIPMWATGNYATSASSSSWDLVA 
NNHLKMTDTGNVTRTAAHEDAIAVASAKNQTVEFDKVNIGGESFKYRNIGAFFDKSKITTNEDGTKAPSKLKFVYIGK 
GQDQDLIGLDUiGKlAVMDRIYTKDLKNAFKKAMDKGARAIMVVim^NYYNRDNWTELPAMGYEADEGTKSQVFSI 
SGDDGVKLWNMINPDKKTEVKRNNKEDFKDKLEQYYPIDMESFNSNKPNVGDEKEIDFKFAPDTDKELYKEDnVPAG 
STSWGPRIDLUJCPDVSAPGKNIKSTLNVINGKSTYGYMSGTSMATPIVAASTVLIRPKUCEMLERPVUCNLKGD^ 

55 TSLTKIALO^^rARPMMDATS^VKEKSQYFASPRQQGAGUNVANALRNEVVATFK^n^DSKGLVNSYCSISLK£IKGDKK 
YFriKLH>rrSNRPLTFKVSASArrrDSLTDRLKl.DETYKDEKSPDGKQIVPEIHPEKVKGANrrFEHDTFriGANSSFDLN 
AVINVGEAKNKNKFVESFIHFESVEAMEALNSSGKKINFQPSLSMPLMGFAGNWNHEPILDKWAWEEGSRSKTLGGYD 
DDGKPKlPGTLNKGlGGEHGIDKFNPAGViQNRKDKNTTSLDQNPELFAFNNEGINAPSSSGSKlANIYPtDSNGNPQDA 
QLERGLTPSPLVLRSAEEGUSIVNTNKEGENQRDLKVISREHHRGILNSKSNDAKGIKSSKLKVWGDLKWDGLIYNPRG 

60 REENAPESKDNQDPATKIRGQFEPIAEGQYFYKFKYRLTKDYPWQVSYIPVKIDhTTAPKlVSVDFSNPEKIKLrrKDTYHK 
VKDQYKNETLFARDQKEHPEKFDEIANEVWYAGAALVNEDGEVEKNLEVTYAGEGOGRNRKLDKDGNTIYEIKGAG 
DLRGKIffiVIAlX^SSNFTKIHRIKFANQADEKGMISYYLVDPDQDSSKYQKLGElAESKFKNLGNGKEGSLKKDTTGVE 
HHHQENEESIKEKSSFTIDRNISTIROFENKDLKKLIKKKFREVDDFTSETGKRMEEYDYKYDDKGNIIAYDDGTDLEYE 
TEKLDElKSKIYGVI^PSKDGHFEILGKISNVSKNAKVYYGNNYKSIEIKATKYDFHSKTMTFDLYANINDIVIXiLAFAG 

DMRLFVKDNDQKKAEIKIRMPEKIKETKSEYPYVSSYGNVIELGEGDLSKNKPDNLTKMESGKIYSDSEKQQYLLKDNII
LRKGYALKVTTYNPGKTDMLEGNGVYSKEDlAJaQKANPNLKAL^ETTIYADSRhfVEDGt^T^^
VFTF^MNDKGEAIDKDGKLVTDSSKLVLFGKDDKEYTGEDKFNVEAlKEDGSMLFIDTKPVNUMDKhnrFNPSKSKKr 
YVR^a»EFYLRGKISDKGG^NW£LRVNESVVDNYLIYGDLHID^rrRDFNIKLNVKDGDIMDWGMKDyKANGFPDKVTD 
MDGNVYLQTGYSDLNAKAVGVHYQFLYDNViCPEVNIDPKGNTSIEYADGKSVVFNINDKRKNGFDGEIQEQHIYINGK 
£YTSFND1KQIIDKTLNIKIVVKDFAR^^TVKEFIL^^CDTGEVSELKPHRVTVT!QNGKEMSST^^

GY0FDGWEISGFEGKKDAGYVINl^KIXn=IKPVFKKIEEKKEEENKPTFDVSKKKDNPOVNHSQLNESHRKEDLQREEH 
SQKSDSTKDVTATVLDKNNISSKSTTNNPNKLPKTGTASGAQTLLAAGIMFIVGIFLGLKKKNQDZ 

ID6 597 bp 


CTTGAATTAAATAAAAAACGTCATGCGACTAAGCATTrrACrGATAAGCrrGTTGATCCCAAAGATGTGCG^^ 
CTATCGAAATTGCAACCTTAGCGCCAAGCGCCCACAACAGCCAGCCTTGGAAATTTGTGGTGGTACGTGAGAAAA 
ATGCTGAACTGGCAAAGTTAGOTATGGTTCCAArmGAACAGGTATCATCAGCGCCTGTAACCATTGCCrTGT^ 
TACAGATACGGACTTAGCCAAACGTGCrCGTAAGATrGCCCGTGTTGGTGGTGCTAATAACTmCTGAAGAGCAA 
CTTCAATATTTTATGAAAAATCTGCCAGCTGAGTTTGCCCGTTACAGTGAGCAACAAGTCAGCGACTACCTAGCT^
TCAATGCAGGTITGGTTGCCATGAACrrGGTrCTTGCATTGACAGACCAAGGAATTGGTTCTAACA™ 
TTTTGACAAATCAAAAGTTAATGAAGTmGGAAATCGAAGACCCmCCGCCCAGAACTCTrGATCACACT^ 
TATACAGACGAAAAATTGGAACCAAGCTACCGCTTGCCAG7AGATGAAATCATCGAGAAAAGATAG 

LELNKKRHATKHFTDKLVDPKDVRTAIEIATLAPSAHNSQPWKFVVVREKNAELAKLAYGSNFEQVSSAPVTIALFTDT
DLAKRARKIARVGGANNFSEEQLQYFMKNLPAEFARYSEQQVSDYLALNAGLVAMNLVLALTDQGIGSmiLGFDKSK 
VNEVLEIEDRFRPELLirVGYTDEKLEPSYRLPVDEIIEKRZ 

ID7 1401 bp


ATGACAGCAATTGATTTTACAGCAGAAGTAGAAAAACGCAAAGAAGACCTCrrGGCTGA CrUUrri AGC Crm 
G^AATCAATTCAGAACGTGATGACAGCAAGGCTGATGCCCAGCATCCAmGGGCCTGGTCCAGTAAAAGCCTTG 
GAGAAATTCCTTGAAATCGCAGACCGCGATGGCTACCCAACTAAGAATGrrGATAACTATGCAGGACATnTGAG 
TTTGGTGATGGAGAAGAAGTTCTCGGAATCmGCCCATATGGATGTGGTGCCTGCTGGTAGCGGTTGGGACACAG 

30 ACCCTTACACACCAACTATCAAAGATGGTCGCCTTTATGCGCGCGGGGCTTCGGACGATAAGGGTCCTA 
CTTGTTACTATGGmGAAAATCATCAAAGAATTGGGTCTTCCAACTTCTAAGAAAGTTCGCT^ 
AGACGAAGAATCAGGCTGGGCAGACATGGACTACTAarTTGAGCACGTAGGACTrGCC^\AACCAGAmCGGm 
CTCACCAGATGCTGAATTTCCAATCATCAATGGTGAAAAAGGAAATATCACGGAATACCTCCACTTTGCAGGAGA 
AAATACAGGTGTTGCCCGTCTTCACAGCTTTACAGGTGGTTTACGTGAAAATATGGTACCAGAATCAGCAACAGC 

35 AGTCGTTTCAGGTGACTTGGCTGACrTGCAAGCTAAACrAGATGCCmGTTGCAGAACACAAAOTAGAG 

ACTCCAAGAAGAAGCTGGCAAATACAAGGTGACGATCATTGGTAAATCAGCCCACGGTGCrATGCCTGCTTCAGG 
TGTCAATGGCGCAAOTACCTTGCCCTCn-CCTCAGCCAGmCGCTrrGCTGGTCCAGCCAAAGACTACOT 
ATCGCAGGTAAAATTCTCnTGAACGATCATGAGGGTGAAAATCTTAAGATTGCTCATGTGGATGAAAAGATGGGT 
GCTCrrrCTATGAATGCCGGCGTCnTCCACrrCGATGAAACAAGTGCTGATAATACCATrGCCCTCAACA 

40 ATCCAAAAGGAACAAGTCCAGAACAAATCAAGTCAATCCTTGAAAACTrGCCAGTTGTTTCTGTTAGCCTGTCTGA 
ACACGGTCACACGCCTCACTATGTGCCAATGGAAGATCCACTTGTGCAAACmGTTGAATATCTATGAAAAACA 
AACTGGCTTTAAAGGTCATGAACAAGTCATCGGTGGTGGAACCTTTGGTCGCTTGCTAGAACGCGGAGTTG^^ 
CGGTGCrATGTTCCCAGACTCGATTGATACCATGCACCAAGCCAATGAAmATCGCCTTGGATGATCTTTrCCGA 
GCAGCAGCAATTTATGCCGAAGCTATTTACGAATTGATCAAATAA 


MTAIDFTAEVEKRKEDLLADLFSLLEINSERDDSKADAOHPFGPGPVKALEKFLEIADRDGYPTKNVDKYAGHFEFGD 
GEEVLGIFAHMDVVPAGSGWDTDPYTPTIKDGRLYARGASDDKGPTTACYYGLKIIKELGLPTSKKVRFIVGTDEESGW 
ADMDYYFEHVGLAKPDFGFSPDAEFPIINGElCGNrrEYLHFAGENTGVARLHSFTGGLRENMVPESATAVVSGDLADL 
QAKLDAFVAEHKLRGELOEEAGKYKVTIIGKSAHGAMPASGVNGATYLALFLSQFGFAGPAKDYLDIAGKILLNDHEG 
ENLKIAHVDEKMGALSMNAGVFHFDETSADNTIALNIRYPKGTSPEOIKSILENLPVVSVSLSEHGHTPHYVPMEDPLVQ
TLLNIYEKQTGFKGHEQVIGGGTFGRLLERGVAYGAMFPDSIDTMHQANEFIALDDLFRAAAIYAEAIYEUKZ 

ID8 1617 bp

GTGTATACTATTATAAAATCAAATATAAAAAAATTrAGTTTATTAACGATAmATrGTTGCTGGTCAA
AATTTATGCAGCAACTATTAATGCTCTGGTGTTGAATGAATTAATTGCGATGAATTTAGAGCGGTTm 
TCAATCTACCAAATGATTGTCTGGTGTGGGATAATATTCCTTGACTGGGTAGTGAAAAATTATCAGGTTGAAGTGA 
TCCAAGAGTTTAATCrAGAGATTCGAAATAGAGTTGCCACAGACATCTCTAACTCTACCT 
TAAATCATCAGGAACATATCrrrCGTGGCTAAATAATGATGTTCAGACTTTAAATGATCAGGCGTTTAAACy^ 

60 TTTrrAGTAATAAAAGGAATTTCTGGTAmTATTTGCAGTTGTCACTCTTAATCACT^ 

AGCCACCTTGTTTTCATTAATG ATTATGCT ACTTGTACCAAAAATCmGCATCGAAAATGCGAGAAGTTAGTCT 
AATrrAACTAACCAAAATGAAGCnTrmAAAATCTAGTGAGACTATATrGAATGGATTTGATG 
TGAATCTTTrATATGTATTGCCTAAGAAAArrAAAGAAGCAGGAATmATrAAAGATGGTTATACAAACAAAGA 
CAACTGTAG AAACGrrAGCAGGCGCTATOGCrrCTTTCTCAATA I'i i' I'IM'r I'CAG ATATCTCTCGTTTrTTTAACA 

65 GGCTATCTTGCAATAAAAGGAATAGTGAAAATTGGTACTATTGAAGCAATAGGAGCACTAACAGGTGTTATTTTT 
ACAGCGCrAGGTGAATTAGGAGGTCAATTATCCTCTATTATTGGTACGAAGCCTATTTTTTTA^
TTAATCCAATTGAGTC^ATAAA-^iTGAATGATATCGAACCAAATGAGGTGAATAGAGATTTTCCGTTATATGAAG 
CAAAAAA TATn GCTATAAGTATGGAGATAAAGAAATATTAAAAAACTTAAATTTTTGTTTTCAAC^^ 
GTATTTAATTITAGGTGAAAGTGGAAGCGGGAAATCTACATTATTAAAATTATTGAATGGCTT^ 
D AGTGGAGAATTGCGATTCTGCGGGGATGATATAAAAAAAACCTCCTATTTAAATATGGTrrCGAATGTTCT 
TAGATCAAAAA GCTT ATrrGTTTGAAGGTACGATTAGAGATAATATTTTATrGGAAGAAAATTATACT^ 
AATACrACAGTCTTTAGAGCAAGTTGGTrrGAGTGTAAAAGATTTTCCTAATAACATTTTAGATT^^ 
GATGATGGGAGATrACTGTCAGGAGGGCAGAAACAAAAAATTACTTTAGCTAGAGGGCTAATTAGAAATAAGAA 
AATAGTATTAATTCACGAGGGAAOTCTGCTATCGATAGGAGAACnTCGTTAGCGATTGAACGTAAGATATrAGA 
lU TAGAGAGGATTTGACTGTCATTATTGTTACCCATGCTCCGCATCCGGAACTTAAACAATATTTTAC^ 
CAATTTCCAAAGGATTTTATTTAA 

MYTIIKSNIKKFSLLTinVAGQLLLIYAATINALVLNELIAMNLERFLKI^IYQMIVWCGnFLDWVVKNYQVEVIQEFNL 
EIJ^RVATDISNSTYQEFHSKSSGTYI^WLNNDVQTLNDQAFKQLFLVIKGISGTIFAVVTLNHYHWSLTVATLFSLMIM 
LLVPKIFASKMREVSLNLTNQNEAFLK^SETILNGFDVLASLNLLYVLPKKIKEAGILLKMVIQRKTTVETI^GAISF

FFQISLVFLTGYl^IKGrVKIGTIEAIGALTGVIFrALGELGGQLSSnGTKPIFLKLYSINPIESNKMNDIEPNEVNRDFPLYE 
AKNICYKYGDKEILKNLNFCFQRNEKYLILGESGSGKSTLLKLLNGFLRDYSGELRFCGDDIKKTSYLNMVSNVLYVDQ 
KAYLFEGTIRDNILLEENYTDEEILQSLEQVGLSVKDFPNNILDYYVGDDGRLLSGGQKQKITLARGLIRNKKIVLIDEGT 
SAIDRRTSLA1ERKILDREDLTVIIVTHAPHPELKQYFTKIY0FPKDFI2 
ID9 705 bp



ATAACAGTTAAACAGATTATGGACGAAATAGCCGTTTCAGATATGACTGCAAGGCGCTATTTACAGGAATTAGCT 
GATAAAGATTTGCTGATTCGTGTGCATGGTGGAGCTGAAAAACTTCGAACCAACrCCCTTT^ 
ZD CAAATATTGAAAA ACAA GCCCTCCAAACGGCAGAAAAACAAGAAATAGCCCATTTTGCAGGCAGTCTAG 
GAAAGAGAAACTATTTTCATTGGACCAGGAACAACA7TAGAG 
GCGTCGTAACCAACAGTCTACCTGTrm 

AAATTATCGCGATATTACAGGTGCTTrTGrrGGTACATTGACCCTACAAAATCTCTCTAATCrCCAAT^ 
GCTT TCGTT AGCTGTAATGGTATTCAAAACGGAGCTCTAGCTACTTT^ 
iU ATCGCTTTAAATAATTCTAATAAAAAATATTTACTCGCAGATCATAGCAAGTTCAATAAGTTTGATr^ 
TTATAATGTATCAAATCTTGATACTATTGTTTCAGATTCTAAACTAAGTGATTCAATCCTTT^ 
ACATTAAAGTCATCAAGCCTTAA 

ETVKQIMDEIAVSDMTARRYLQELADKDLLIRVHGGAEKLRTNSLLTNERSNIEKQALQTAEKQEIAHFAGSLVEERETI 
53 FIGPGTTLEFFARELPIDNIRVVTNSLPVFLILSERKLTDLILIGGNYRDITGAFVGTLTLQNLSNLQFSKAFVSCNGIQNGA 
LATFSEEEGEAQRIALNNSNKKYLLADHSKFNKFDFYTFYNVSNLDTIVSDSKLSDSILFKLSKHIKVIKPZ 

ID10 483 bp

ATGACTGAGTTrrCGTTAGATCri'Cri'CTAGAAGCCATTAAACTAGCTCGTTGGACCTACT

AGCTAGACAAAACAGATAAAGACCAAGAGCTrAAAACTGAAATTCAATCCATCTTTATCGAACACAAGGGAAATT 
ATGCTTATCGCCGGGTTCATTTAGAACTAAGAAATCGTGGTTATCTGGTAAATCATAAAAGAGTrCAAGGOr^ 
GAAAGTACTCAATTTACAAGCTAAAATGCGAAAGAAACGAAAATATTCTTCTCATAAAGGAGACGTTGGTAAGAA 
GGCAGAGAATCTCArrCAAGCCCAATTTGAAGGCTCTAAAACAATGGAAAAGTGCTACACAGATGTGACTGAATT 

45 TGCCATTCCAGCAAGTACTCAAAAGCTTTACTTATCACCAGTTTT^ 
AATCTTTCTTGTTCGCCTAATTTAGAATAA 

MTEFSLDLLLEAIKLARWTYYYHLKQLDKTDKDQELKTEIQSIFIEHKGNYAYRRVHLELRNRGYLVNHiCRVQGLMK 
^^^LQAKMRKKRKYSSHKGDVGKKAENLIQAQFEGSKTMEKCYTDVTE^

ID14 1266 bp 

CCAGGATTTGGTACCGTTGCAAGTGGTGTGCCTTTCCTCCTA.AAGGAAAATGGAGGAAAAATCAATCAATCAGCA 

CA TTCA GA TATC AAAGTTGCTAAGGTATTGGTCAAGGATGAAGATGAAAAAAATCGCTTGCTTGCAGCAGGGAAT 

GACTTTAACTTTGTAACCAATGTGCATGATATTrTATCAGACCAGGATATTACTATCGTAGTGGAAT^^ 

GTATTGAGCCTGCTAAAACCTTTATCACTCGTGCCrrGGAAGCTGGAAAACACGTTGTTACTGCTAACA^^ 

TTTAGCTGTCCATGGCGCAGAATTGC TAGA AATCGCTCAAGCTAACAAGGTAGCACnTTACTACGAAGCAGCAGT 

TGCTGGTGGGATTCCAATTCTTCGTACTTTAGCAAATTCCrrGGCTTCTGATAAAATTAC^ 

OU GTCAACGGAACTTCCAACrTCATGGTGACCAAGATGGTGGAAGAAGGCTGGTCTTACGATGATGCTCTTGCGGA 

G CACA ACGTCTAGGAT TTGCA GAAAGCGATCCGACGAATGACGTAGATGGGATTGATGCAGCCTACAAGATGGTT 
ATTTTGAGCCAATTTGCCTTTGGCATGAAGATTGCCTTTGATGATGTAGCCCACAAGGGAATCCGO^ 
CAGAAGACGTAGCTGTAGCTCAAGAGCTTGGrrACGTAGTGAAATTGGTTGGTTCTATTGAGGAAACTTCTT 
"^ATTGCTGCAGAAGTGACTCCAACCTTCCTACCTAAAGCGCACCCACTTGCTAGTGTGAATGGCGTAATGAACGCT 

05 GTCTTTGTAGAATCTATCGGTATTGGTGAGTCrATGTACTACGGACCAGGTGCGGGTCAAAAACCAACTGCAACA 
AGTGTTGTAGCrGATATTGTCCGTATCGTTCGTCGTTTGAATGATGGTACTATTGGCAAAGACTTCAACGAATATA
GCCGTGACTTGGTCTTGGCAAATCCTGAAGATGTCAAAGCAAACTACTATTTCTCAATCTTGGCTCTA 
AGGTCAGGTCTTGAAGTTGGCTGAAATCTTCAATGCTCAAGATATrrCCTTTAAGCAAATCC^ 
GAGGGTGACAAGGCGCGTGTCGTTATCATCACACACAAGATTAATAAAGCCCAGCTTGAAAATGTCTCAGCTGAA 
TTGAAGAAGGTTTCAGAATTCGACCTCTTGAATACCTTCAAGGTGCTAGGAGAATAA

PGFGTVASGVPFLLKENGGKINQSAHSDIKVAKVLVKDEDEKNRLLAAGNDFNTVTbTVDDIL^DQDrriVVELMGRIEP 
AKTmIULEAGKHVVTA^^CDLLAVHGAELLEiAQANKVALYYEAAVAGGIPILRTLANSLASDK^^RVLGVVN^ 
MVTKMVTEGWSyDDALAEAQRLGFAESDPTNDVDGIDAAYKMVILSQFAFGMKlAFDDVAHKGlRNITPEDVAVAQE 
LGYVVKLVGSIEETSSGIAAEVTPTFLPKAHPLASVNGVMNAVFVESIGIGESMYYGPGAGQKPTATSVVADIVRTVn^
KDGTIGKDFHEYSRDLVLANPEDVKANYYFSILALDSKGQVLKL/lEIFNAQDISFKQILQDGKEGDKARVVIITHKINKA 
QLENVSAELKKVSEFDLLNTFKVLGEZ 

ID16 1725 bp 


ATGAAACACCTATTATCTTACTTCAAACCCTACATCAAGGAATCAATTTTAGCCCCCTTGTTCAAGCTGTTAG 
CTGi 1 1 1 iGAGCTCTTGGTTCCCATGGTGA TTGC TGGGATTGTTGACCAATCTTTACCTCAGGGAGATCAAGGTCA 
TCrCTGGATGCAGATTGGCCTGCTCCTTATCTTrGCAGTAATTGGCGTTTTAGTGGCCTTGATAG 
CAGCAAAGGCAGCAGTAGGTTCTGCTAAGGAATTGACAAACGATCTTTATCGTCATATTCTrTCCTTGCCCAAGGA 

20 CAGCAGAGACCGTCTGACAACTTCTAGTTTGGTCACTCGCTTGACTTCGGATACCTACCAGATTCAGACTGGTATC 
AATCAATTCCTGCGTCTCTTTTTACGAGCGCCCATTATCG i ' l ' tTl 'GGTGCCA ri ' i i ' l ATGGCTTATCGAATCTCAGC 
TGAGTTGACTTTCTGGTTCTTAGTCTTGGTTGCCATTTTGACCATTGTCATTGTAGGGTTATCrC 
' CTTTCTACAGTAGTCTCAGAAAGAAAACGGACCAACTGGTTCAGGAAACGCGCCAGCAATTGCAAGGGATGCGGG 
TTATTCGTGCrrrrGGTCAAGAAAAACGAGAGTTACAGATTTTTCAAACCCTTAACCAAGT^ 

25 AGAAAAGACAGGTTTCTGCTCTAGTTTATTAACACCTCTGACCTATCTGATTGTCAATGGAACTCTTCTCGTTAT^ 
ATCTGG CAAG GCTATATTTCAATTCAAGGAGGAGTGCTCAGTCAAGGTGCTCTCATTGCTCTTATCAATTACCT 
TACAGATTTTGGTGGAATTGGTCAAGCTAGCCATGTTGATCAATTCCCTCAACCAGTCCTAT^ 
AA TCGA GGAAGTCTTrGTTGAGGC TCCA GAGGATATCCATTCAGAGTTAGAACAAAAGCAAGCTACCAGAGATAA 
GGTmACAAGTCCAAGAATTGACCTrrACCTATCCTGATGCGGCCCAGCCrrCTCTGA 

30 AT GACT CAAGGACAAATTCTAGGTATCATCGGGGGAACTGGTTCTGGTAAATCAAGCTTGGTGCAACTCTTACTTG 
GACTTTATCCAGTAGACAAGGGGAACATTGACCnTATCAAAATGGACGTAGTCCTCTTAATTTGGAGCAG^ 
GGTCTTGGATTGCCTATGTACCTCAAAAGGTCGAACTCTTTAAAGGAACCATTCGTTCCAACTTGACTCTAG 
CAATCAAGAAGTATCTGACCAGGAACTCTGGCAGGCCTTGGAGATTGCGCAAGCTAAGGATTTTGTCAGTGAAAA 
GGAAGGACTCTTGGATGCTCTAGTTGAGGCAGGGGGGCGAAATTrCTCAGGTGGACAAAAACAAAGATTGTCTAT 

35 CGCCCGAGCAGTCTTGCGCCAGGCTCCCTrrCTCATCCTAGATGATGCAACCTCGGCACTGGATACCATTACAGAG 
TCCAAGCTCTTGAAAGCTATTAGAGAAAATTTTCCAAACACGAGCTTAATTTTGATCTCTCAACGAACCT 
TACAGATGGCGGACCAGATTCTCCTCTTGGAAAAAGGTGAGTTGCTAGCTGTTGGCAAGCACGATGACTTGATGA 
AATCCAGCCAAGTCTATTGTGAAATCAATGCATCCCAACATGGAAAGGAGGACTAG 

40 MKHLI^YFKPYIKZSILAPLFKLLEAVFELLVPMVIACrVDQSLPQGDQGHLWMQIGLLLIFAVIGVLVALIAQFYSAKA 
AVGSAKELTNDLYRHILSLPKDSRDRLTTSSLVTRLTSDTYOIQTGINQFLRLFLRAPIIVFGAIFMAYRISAELTFWFLVL 
VAlLTIVrVGLSRLVNPFYSSLRKKTDQLVQETRQQLQGMRVIRAFGQEKRELQIFQTLNQVYARLQEKTGFWSSLLTPL 
TYLIVNGTLLVIIWQGYISIQGGVLSQGALIALINYLLQILVELVKLAMLINSLNQSYISVKRIEEVFVEAPEDIHSELEQKQ 
ATRDKVLQVQELTFTYPDAAQPSLRYISFDMTQGQILGIIGGTGSGKSSLVQLLLGLYPVDKGNIDLYQNGRSPLNLEQ 

45 WRSWIAYVPQKVELFKGTIRSNLTLGFNOEVSDQELWQALEIAQAKDFVSEKEGLLDALVEAGGRNFSGGQKQRLSIA 
RAVLRQAPFLILDDATSALDTITESKLLKAIRENFPNTSLILISQRTSTLQMADQILLLEKGELLAVGKHDDLMKSSQVYC 
EINASQHGKEDZ 

ID18 1224 bp


atgaaacgttctctccactcaagagtcgattacagtttgctcttgccagtattttttctact 
ggctatctatatagccgtta gtca tgattatcccaataatattctgcccattttagggcagcaggtcgcctggatt 
g ccrrg gggcttgtgattggttttgtggtcatgctct ttaa tacagaa l 1 icl ' i 1 ggaaggtgaccccctttctata 
tattttaggcttgggacttatgatcttgccgattgtattttataatccaagcttagttgcatcaacgggt^ 
5d aactgggtatcaataaatggaattaccctattccaaccgtcagaatttatgaagatatcctatatcctcatgttgg 
ctcgtgtcattgtccaatttacaaagaaacataaggaatggagacgcacggt^ 

CTGGATGATTCTCTTTACCATTCCAGTC^ 

TAGCCATTTTCTCAGGAATCGTTTTA TTAT CAGGGGTTTCTTGGAAAATTATTATCCCAGTAT^ 
ACAGGAGTTGCTGGTTTC TTAG CTATCTTTATTAGCAAGGACGGACGAGC'i'ri'i C I'i CACCAGATTGGAATGCCGA 

60 CCTACCAAATTAATCGGATTTrGGCTTGGCTCAATCCCTTTGAGTTTGCCCAAACAACGACT^ 

AGGGCAGATTGCC ATTGG GAGTGGTGGCTTATTTGGTCAGGGATTTAATGCTTCGAATCTGCTTATCCCAGTTCG^ 
GAGTCAGATATGATTTTTACGGTTATTGCAGAAGATTrrGGCTTTATTGGCTCTGTCCT 
CATGTTGArTTACCGTATGTTG AAGA TTACTCTTAAATCAAATAACCAGTTCTACACTTATATTTCCACAC^ 
T TATG ATGTTGCTCTTCCACATCnTrGAGAATATCGGTGCrGTGACTGGACTACTTCCITTGA^ 

65 CCTTTCATTTCGCAAGGGGGATCAGCTATTATCAGTAATCTGATTGGTGTTGGTTTGCTTTTATCGATG^^ 




GACTAATCTAGCTGAAGAAAAGAGCGGAAAAGTCCCATTCAAACGGAAAAAGGTTGTATTAAAACAAATTAAATA 
A 

MKRSLDSRVDYSLLLPVFFLLVIGVVAIYIAVSHDYPNNULPILGQQVAWlALGLVIGFWMLFOTEFLWKVTPFLmG 
GLMILPIVFYNPSLVASTGAK^WVSlNG^rLFQPSEFMKlSYILMLARVIVQF^KKHKEWRRTVPLDFLLIFWMrLm

VLLALOSDUGTALVFVAIFSGrVLLSGVSWKIIIPVFVTAVTGVAGFLAlFlSKDGRAFLHQIGMPTYQINRILAWLNPFEF 
AQTTTYQQAQGQlAIGSGGLFGQGFNASNLLIPVRESDMIFTVlAEDFGFIGSVLVlALYLMLIYRMLKrrLKSNNQFVTY 
ISTGLIMMLLFHIFENIGAVTGLLPLTGlPLPFISQGGSAlISNLIGVGLLLSMSYQTNLj\££KSGKVPFKRKKVVLKQiK2 

ID22 987 bp

ATGGTGGCTAAGAAAAAAATCTTATTrrrTATGTGGTCTTTTTCTC^ 

CCATTGTTTCAAATCTGGATCCAGAAAAGTATGATATTGATATTCTTGAAATGGAGCACTTTGACAAG 
ATCTGTTCCAAAGCATG TACG CATTTTAAAATCCCTTCAAGATTATCG CCA AACC AGATGGTT ACG AGCTTTnTG 

15 TGGAGAATGAGAATTTATTTTCCAAGACTGACTCGTCGTTTGCTTGTAAAAGATGATTATGATCTTGAAC^^ 

rrACCATTATGAATCCACCACTGTTGTTCTCTAAAAGAAGAGAAGTCAAGAAGATATCTTGGATTCATGGAAGTAT 
TG AAGAACT T CI 1 AAGG ATAGCTCTAAAAG AG AATCACATAG AAGCCAGTTGGATGCTGCG AATACAATTGTAGG 
GATTTCAA AAAA GACCAGCAArrCrATCAAGGAAGTTTATCCAGATTATACTTCTAAATTACAGACAATCTACAAT 
GGATATGATTTTCAGACTATTCTAGAAAAATCrCAAGAGAAGATCGATATCGAGATTGCrCCTCAAAGTATCTGTA 

CTATCGGACGGATTG AGGAA AATAAGGGTTCTGACCGTGTAGTGGAAGTGATACGATTATTACACCAAGAGGGAA
AAAACTATCATCTCTATTTTATCGGGGCTGGTGATATGGAAGAGGAACTGAAAAAACGAGTCAAAGAGTATGGGA 
TTGAGGACTATGTACAT TTCCT TGGTTATCAAAAAAATCCTTATCAGTATCTATCTCAGACGAAAGTTCTTTTGTC^ 
ATGTCTAAACAAGAAGGTTTTCCTGGAGTGTATGTGGAGGCCTTGAGTCTGGGACrCCCrTTTATCTCTACGGACG 
TTGGAGGGGCTGAGGAATTATCCCAAGAAGGACGATTTGGACAAATCATTGAGAGCAATCAAGAGGCAGCTCAG 

GCGATTACTAATTACATGACTTCTGCCTCAAACrrTGATGTCGATGAGGCTAGCCAATTCATTCAACAATTTA
TTACAAAACAAATCGAACAAGTAGAAAAACTATTAGAGGAGTAG 

MVAKKKILFFMWSFSLGGGAEKIl^TIVSNLDPEKYDIDILEMEHFDKGYESVPKHVRILKSLQDYRQTRWLRAFLWIU^ 
RIYFPRLTRRLLVKDDYDVEVSFTIMNPPLLFSKRREVKKISWIHGSIEELLK^SSKRESHRSQLDAAOTrVGISKKTSNSIK 
EVYPDYTSKLQTIYNGYDFQTILEKSQEKIDIE1APQSICTIGRIEENKGSDRVVEVIR1.LHQEGKNYHLYFIGAGDMEEHL
KKRVKEYGIEDYVHFLGYQKNPYQYLSQTKVLLSMSKQEGFPGVYVEALSLGLPFISTDVGGAEELSQEGRFGQIIESNQ 
EAAQAITNYMTSASNFDVDEASQFIQQFTITKQIEQVEiaLLEEZ 
ID23 1434 bp



ATGGAAACTGCATTAATTAGTGTGATTGTGCCAGTCTATAATGTGGCGCAGTACCTAGAAAAATCGATAGCTTCCA 
TTCAGAAGCAGACCTATCAAAATCTGGAAATTATTCrTGTTGATGATGGTGCAACAGATGAAAGTGGTCGCTTGTG 
TGATTCAATCGCTGAACAAGATGACAGGGTGTCAGTGCTTCATAAAAAGAACGAAGGATTGTCGCAAGCACGAAA 
TGATGGGATGAAGCAGGCTCACGGGGATTATCrGATTTTTATTGACTCAGATGATTATATCCATCCAGAAATGATT 

CAGAGCTTATATGAGCAATTAGTTCAAGAAGATGCGGATGTTTCGAGCTGTGGTGTCATGAATGTCTATGCTAATG
ATGAAAGCCCACAGTCAGCCAATCAGGATGACTATTTTGTCTGTGATTCTCAAACATTTCTAAAGGAATACCTCAT 
AGGTGAAAAAATACCTGGGACGATTTGCAATAAGCTAATCAAGAGACAGATTGCAACTGCCCTATCCTTTCCTAA 
GGGGTTGATTTACGAAGATGCCTATrACCATTTTGATTTAATCAAGTTGGCCAAGAAGTATGTGGTTAATACTAAA 
CCCrATTATTACTA TTTCC ATAGAGGGGATAGTATTACGACCAAACCCTATGCAGAGAAGGATTrAGCCTATATTG 

ATATCTACCAA AAGTT TTATAATGAAGTTGTGAAAAACTATCCTGACTTGAAAGAGGTCGCTTTTT^

CTATG CCCACT TCTTTATTCTGGATAAGATGTTGCTAGATGATCAGTATAAACAGTTTGAAGCCTATrCTCAGATT 
CATCGTTTTTTAAAAGGCCATGCCrrrGCrATTTCTAGGAATCCAATTTTCCGTAAGGGGA^ 

TGGCCCTATTCATAAATATTrCCTTATATCGATTCrrATTACTGAAAAATATTGAAAAATCTAAAAAATTACATTA 


METALISVIVPVYNVAQYLEKSIASIQKQTYQNLEIILVDDGATDESGRLCDSIAEQDDRVSVLHKKNEGLSQARNDGM 
KQAHGDYLIFIDSDDYIHPEMIQSLYEQLVQEDADVSSCGVMNVYANDESPQSANQDDYFVCDSQTFLKEYLIGEKIPG 
TlCNKLIKRQlATALSFPKGLIYEDAYYHFDLIKLAKKYVVNTKPYYYYFHRGDSnTKPYAEKDLAYIDIYQKFYNEVV 

KNYPDLKEVAFFRLAYAHFFILDKMLLDDQYKQFEAYSQIHRFLKGHAFAISRNPIFRKGRJUSALALFINISLYRFLLLK 
NIEKSKKLHZ



ID24 735bp

ATGAGAATCAAAGAGA AAACC AATAATATTAATGGAGGAATAAAAAATGTAAGTAAGCATTATGGTCATTCAATC 
ATTCTCAAAGATATAAATTTTGCACTTAACAAGGGTGAAATTGTTGGTCTAGCAGGGAGAAATGGAGTTGGTAAG 
AGTACGTTGATGAAAATTCTTGT TCAGA ATAATCAACCGACTTCAGGTAATATTATAAGCAGTGATAATGTTGGGT 
ATTTAATCGAAGAACCAAAATTATTTTTATCTAAAACAGGTTTAGAGAATITAAAATAm 

TGTTGACTAC AATC AAGAAAGATTTAGATGTTTGATCCAAGAGTTAGATTTGACTCAGTCTATTAATAAAAAAGTA 
AAGACCTATTCTTTGGGTACAAAACAAAAATTAGCTTTGCTTCTAACrCTCGTTACGGAACCtGAT^^ 
TAGATGAACCGACT AATG GTTTAGATATTGAATCATCACAAATAGTTITAGCGGTTCTAAAAAAATTAGCT^
TGAAAATGTGGGAATTTTAATATCGAGTCATAAATTAGA,^GACATTG.\AGAAATITGTGAGAGAGTTCT^ 
GAGAACGGGCrrTTG ACAT rrCAAAAAGTAGGAAAAGATAGTCATA A ' ri ' lCi^ ' Gri ' l GAGATAG ClUUU ^ 
CTACAGATAGAGACATTTTCATTACCAAACAAGAATTTTGGGATATTGTTTAG 


MRIKEKTNNlNGGIKhTVSKHYGHSULKDINFALNKGEIVGLAGRNGVGKSTLMKILVONNQPTSGNIISSDNVGYLmE 

KLFLSia*GLENLKYL5NLYGVDYNQERFRCLIQELDLTQSINKKVKTYSLGTKQKLAUXTLVTEPDILlLDEPTNGLDlE 

SSQIVi^VLKKLAI-HEWGILISSHKLEDIEEICERVLFLENGLLTFQKVGKDSHNFLFEIAFSSATDRDIFrrKQEFWDI^^ 

ID25 1704bp

ATGACTGAATTAGATAAACGTCACC GCAG TAGCATTTATGACAGCATGGTTAAATCACCTAACCGTGCTATGCTTC 
GTGCGACTGGTATGACAGATAAGGACTTTGAAACATCGATTGTGGGAGTGATTTCGACTTGGGCGGAAAATACAC 
CATGTAACATTCACTTGCATGATTTCGGGAAACTGGCTAAAGAAGGTGTCAAATCTGCAGGCGCTTGGCCTGTAC 

AGTTTGGAACCATTACCGTAGCGGACGGGATCGCTATGGGAACGCCTGGTATGCGTTrCTCTCTAACATCTCGTGA
CATCATCGCGGACTCCATCGAGGCGGCTATGAGTGGTCACAACGTGGATGCCrrCGTCGCTATCGGTGGCTGTGA 
CAAGAACATGCCTGGATCTATGATTGCTATTGCTAATATGGATATCCCAGCTATTTTCGCCTATGGTGGAACTATT 
GCACCGGGAAATCTTGATGGTAAAGATATCGACTTGGTTTCTGTCTrTGAAGGTATCGGAAAATGGAACCACGGT 
GACATGACAGCTGAGGACGTGAAACGTCTTGAATGTAATGCCTGCCCTGGCCCTGGTGGTTGTGGTGGTATGTAT 

ACTGCTAATACCATGGCAACTGCTATCGAAGTTCTAGGGATGAGTTTGCCAGGGTCATCCTCTCACCCAGCTGAAT
CAGCTGATAAGAAAGAAGATATCGAAGCAGCAGGACGTGCTGTTGTTAAGATGTTGGAACTTGGTCTCAAACCAT 
CAGATATCTTGACTCGTGAAGCCTTTGAAGATGCTATCACTGTAACGATGGCTCTCGGTGGTTCTACAAACGCCAC 
TCTTCACTTGCTCGCCATTGCCCATGCCGCAAATGTTGACTTGTCACTTGAGGACTrCAATACGATTCA 
GTGCCTCACTTGGCCGACTTGAAACCATCTGGTCAGTATGTCTTCCAAGACCTCTACGAAGTCGGTGGTGTCCCTG 

CGGTTATGAAGT ATTT GTTGGCAAATGGrrTCCTTCACGGAGATCGCATCACATGTACTGGTAAGACTGTAGCTG
AAACTTGGCTGACrrrGCAGACTTGACTCCAGGCCAAAAAGTTATCATGCCACTTGAAAATCCA/^ 
TGGTCCGCTTATCATCTTGAACG GGAA CCTTGCTCCTGACGGTGCAGTTGCCAAGGTATCAGGTGTTAAAGTGCGT 
CGTCACGTTGGGCCAGCTAAGG TCTTT GACTCAGAAGAAGATGCGATTCAGGCCGTTCTGACAGATGAAATCG^ 
GATGGCGATGTAGTCGTTGTTCGTTTTGTTGGACCTAAAGGTGGTCCTGGTATGCCTGAGATGCrATCACT^ 

AATGATrGTTGGTAAj^GGTCAGGGAGATAAGGTGGCCCTCTTGACGGACGGACGTTTCTCTGGTGGTACnTAT^

CTGGTTGTTGGACATATCGCTCCTGAAGCrCAGGATGGTGGACCAATTGCCTATCTCCGTACCGGCGATATCGTTA 
CGGTTGACCAAG ATAC CAAAGAAATTTCTATGGCCGTATCCGAAGAAGAACTTGAAAAACGCAAGGCAGAAACA 
ACCTTGCCACCACTTTACAGCCGTGGTGTCCTCGGTAAATATGCCCACATCGTATCATCTGCTTCACGCGGAGCCG 
TGACAGACTTCTGGAATATGGACAAGTCAGGTAAAAAATAA 


MTELDKRHRSSIYDSMVKSPNi^MLRATGMTDKDFETSIVGVlSTWAENTPCNIHLHDFGKLAKEGVKSAGAWPVQFG 
TITVADGIAMGTPGMRFSLTSRDUADSIEAAMSGHNVDAFVAIGGCDKNMPGSMIAIANMDIPAIFAYGGTIAPGNLDG 
KDIDLVSVFEGIGKWNHGDMTAEDVKRLECNACPGPGGCGGMYTANTMATAIEVLGMSLPGSSSHPAESADKKEDIE 
AAGRAVVKMLELGLKPSDILTREAFEDArrVTMALGGSTNATLHLLAlAHAANVDlSLEDFNTIOERVPHLADLKPSGQ 
YVFQDLYEVGGVPAVMKYLLANGFLHGDRITCTGKTVAENLADFADLTPGQKVIMPLENPKRADGPUILNGNLAPDG
AVAKVSGVKVRRHVGPAKVFDSEEDAIQAVLTDEIVDGDVWVRFVGPKGGPGMPEMLSLSSMIVGKGQGDKVALLT 
DGRFSGGTYGLVVGHIAPEAQDGGPIAYLRTGDIVTVDQDTKEISMAVSEEELEKRKAETTLPPLYSRGVLGKYAHIVSS 
ASRGAVTDFWNMDKSGKKZ 

ID26 274bp

ATGTTATAATAAAAATAAAGAATTrAAGGAGAAATACAATATGTCAATTTTTATTGGAGGAGCATGGCCATATCC 
AAACGGTTCGT TACAT ATTGGTCACGCGGCAGCGCTTTTACCGGGGGATATTCTTGCAAGATACTATCGTC^ 
GGGAGAGGAAGTTTTATATGTTTCTGGAAGTGATTGTAATGGAACCCCTATTTCTATCAGAGCTAAAAAAGA 
TAAGTCTGTGAAAGAAATTGCTGATTTTTATCATAAGGAATTTAATCCA

CYNKNKEFKEKYNMSIFIGGAWPYANGSLHIGHAAALLPGDILARYYRQKGEEVLYVSGSDCNGTPISIRAKKENKSVK 
EIADFYHKEFNP 

ID28 1065bp

ATGACAACATTATTTTCAAAAATTAAAGAAGTAACAGAACTTGCTGCAGTCTCAGGTCATGAAGCGCCTGTCCG 

GCrrATCTTCGTGAAAAGTTGACACCGCATGTGGATGAAGTGGTGACAGATGGCTTGGGTGGTATTTTTGGTATC^ 

AACATTCAGAAGCTGTGGATGCACCGCGCGTCTTGGTCGCTTCTCATATGGACGAAGTTGGTTTTATGGTCAGra^ 

AATCAAGCCAGATGGTACCTTCCGTGTCGTAGAAATCGGTGGCTGGAACCCCATGGTGGTTAGCAGCCAACCTTT
CAAACTCTTGACTCGTGATGGTCATGAAATTCCTGTGATTTCAGGTTCTGTTCCTCCGCATTTGACT 
GGGGGACCAACCATGCCAGCCATTGCCGATATCGTTTTTGATGGTGGTTTTGCGGACAAGGCTGAGGCAGAAACT 
TTTGGCATCCGTCCTGGTGATACCATTGTACCAGATAGTTCTGCAJ^TTTTGACAGCCAATGAAAAAAATAT^^ 
CAAAAGCTTGGGATAACCGCTACGGTCTCCTCATGGTAAGCGAGCTAGCTGAAGCTTTATCGGGTCAAAAACTCG 

GCAATGAACTCTATCTGGGTTCTAACGTCCAAGAAGAAGTTGGTCTGCGTGGCGCTCATACCTCTACAACCAAGTT



TGACCCAGAAGTCTTCCTCGCAGTTGATTGCTCACCAGCAGGTGATGTCTACGGTGGTCAAGGCAAGATTGGAGA
TGGAACCTTGATTCGTTTCTATGATCCAGGTCACTTGCTTCTCCCAGGGATGAAGGATTTCC^^ 
GAAGAAGCTGGTATCAAGTACCAATACTACTGTGGTAAAGGCGGAACAGATGCAGGTGCAGCTCATCTGAAAAAT 
GGTGGTGTCCCATCAACA ACTA TCGGTGTCTGCGCTCGTTATATCCATTCTCACCAAACCCTCTATGCAATGGATG 
ACTTCCrAGAAGCGCAAGClJ'lClUACAAGCCTTGGTGAAGAAATTGGATCGTTCAACGGTTGATTTGATTAAACA
TTATTAA 

MTTLFSKIKEVTELAAVSGHEAPVRAYLREKLTPHVDEVVTDGLGGIFGIKHSEAVDAPRVLVASHMDEVGFMVSEIK 
DGTFRWEIGGWNPMVVSSQRFKLLTRDGHEIPVISGSVPPHLTRGKGGPTMPAIADIVFDGGFADKAEAESFGIRPGDT 
IVPDSSAlLTA>mKNIISKAWDNRYGVLMVSEI^EALSGQKLGNELYLGShrVQEEVGLRGAHTSTTKFDPEVFLAVDCS
PAGDVYGGQGKIGDGTURFyDPGHLLLPGMKDFLLTTAEEAGIKYQYYCGKGGTDAGAAHLKNGGVPSTriGVCARY 
IHSHQTLYAMDDFLEAQAFLQALVKKLDRSTVDLIKHYZ 

ID31 1182bp


ATGGAATTTTCTATGAAATCAGTCAAAGGACTACTCTTTATCATAGCTAGTTTTATOT 
GAACACTTCrCCCCAATTCATGATTCCAGGACrAGCTTTAACAAGCCTATCTCTGACTn^ 
CTCCCACTACTAG AAAG CTGG TTTCA CAGTTTGGAGAAGGTCTACACCGTCCACAAATTCACAGCCnTTCTCT 
TCATCCTACTA ATCT rrCATAACTTTAGTATGGGCGGTTTGTGGGGCTCTCGCTTAGCT^ 
GCCATCTATATCTTTGCCAGCATCATC CTTG TCGCCTATTTAGGCAAATACATCCAATACGAAGCTTGGCGATGGA
TTCACCGCCTGGTTTACCTAGCCTATATTTTAGGACTCT^ 

TTTAATCTTCTAAG'rrrrcn G'i'l GGTAGCTATGCC C 1 1 i ' l AGGCTTACTAGCTGG' t ' lUUU ATATCATTTTTCTATAT 

CAAAAGATTTCCTTCC CCTA TCTAGGGAAAATTACCCATCTCAAACGCTTAAATCACGATACTAGAGAAATTCAA 

ATCCATCTTAGCA GACCT TTCAACTATCAATCAGGACAATTTGCCTTTCTAAAGATTTTCC 

GTGCTCCGCATCCCTTTTCTATCTCAGGAGGTCATGGTCAAACTCTTTACTTTACTG^

TACCAAGAATATCTATGATAATC TTCA AGCCGGCAGCAAAGTAACCCTAGACAGAGCTTACGGACACATGATCAT 
AGAAGAAGGACGAGAAAATCAGGTTTGGATTGCTGGAGGTATTGGGATCACCCCC^ 
ACATCCTATTTTAGATAAACAGGTTCACT TCTA CTATAGCTTCCGTGGAGATGAAAATGCAGTCT^ 
CTCCGTAACTATGCTCAGAAAAATCCTAATTTTGAACTCCATCTAATCGACAGTACGAAAGACGGCTATC^ 

TTGAACAAAAAGAAGTGCCCGAACATGCAACCGTCTATATGTGTGGTCCTATTTCTATGATGAAGGCACTTGCCA
AACAGATTAAGAAACAAAATCCAAAAACAGAGCATATTTAC 

MEFSMKSVKGLLFIIASFILTLLTWMOTSPQFMIPGLALTSI^LTHIJ^TRLPLLESWFHSLEKVYTVHK^ 
NFSMGGLWGSRLAAQFCNLAIYIFASIILVAYLGKYIQYEAWRWIHRLVYLAYILGLFHIYMIMGNRIXTFNLLSFLVGS 
YALLGLLAGFYIIFLYQKISFPYLGKITHLKRLNHDTREIQIHLSRPFNYQSGQFAFLKIFQEGFESAPHPFSISGGHGQTXY
FTVKTSGDHTKNIYDNLQAGSKVTLDRAYGHMIIEEGRENQVWIAGGIGITPFISYIREHPILDKQVHFYYSFRGDENAV 
YLDLLRNYAQKNPNFELHLIDSTKDGYLNFEQKEVPEHATVYMCGPISMMKALAKQIKKQNPKTEHIY 

ID32 900bp


ATG AC n ■ 1 ■ 1 A A ATC AGGCTTTGTAGCC ATTTTAGG ACGTCCCAATGTTGGG AAGTCAACCTTTTTAA ATCACGT^ 
TGGGGCAAAAGATTGC CATC ATGAGTGACAAGGCGCAGACAACGCGCAATAAAATCATGGGAATTTACACGACTG 
ATAAGGAGCAAATTGTCTTTATCGACACACCAGGGATTCACAAGCCTAAAACAGCTCTCGGAGATTTCATGGTTG 
AGTCTGCCTACAGTACCCTTCGCGAAGTGGACACTGTTCTTTTCATGGTGCCTGCTGATGAAGCGCGTGGTAAGGG 
GGACGATATGATTATCGAGCGTCTCAAGGCTGCCAAGGTTCCTGTGATTTTGGTGGTGAATAAAATCGATAAGGTC
CATCCAGACCAGCTCTTGTCTCAGATTGATGACTTCCGTAATCAAATGGACTTTAAGGAAATTGTTCCAATCTCAG 
CCCTTCAGGGAAATAACGTGTCTCGTCTAGTGGATATTTTGAGTGAAAATCTGGATGAAGGTTrCCAATAT^ 
GTCTGATCAAATCACAGACCATCCAGAACGTTTCrrGGTITCAGAAATGGTTCGCGAGAAAGTCTrGC^\CCT 
CGTGAAGAGATTCCGCATTCTGTAGCAGTAGTTGTTGACTCTATGAAACGAGACGAAGAGACAGACAAGGTTCAC 

atccgtgcaaccatcatggtcgagcgcgatagccaaaaagggattatcatcggtaaaggtggcgctatgcttaag
aaaatcggtagcatggcccgtcgtgatatcgaactcatgctaggagacaaggtcttcctagaaacctgggtcaag 
gtcaagaaaaactggcgcgataaaaagctagatttggctgactttggctataatgaaagagaatactaa 

MTFKSGFVAILGRPNVGKSTFLNHVMGQKIAIMSDKAQTTRNKIMGIYTTDKEQIVFIDTPGIHKPKTALGDF 
TLREVDTVLFMVPADEARGKGDDMUERLKAAKVPVILVVNKIDKVHPDQtLSQIDDFRNQMDFKEIVPISALQGNNfVS
RLVDILSENLDEGFQYFPSDQITDHPERFLVSEMVREKVLHLTREEIPHSVAVVVDSMKRDEETDKVHIRATIMVERDSQ 
KGIIIGKGGAMLKKIGSMARRDIELMLGDKVFLETWVKVKKNWRDKKLDLADFGYNEREYZ 

ID33 855bp


CTGCTTCTTGTTTTTACAGAAGGAGGACTTATGCCTGAATTACCTGAGGTTGAAACCGT^ 

AATTGATTATAGGAAAGAAGATTTCGAGTATAGAAATTCCCTACCCCAAGATGATTAAGACGGATTTGGAAGAGT 

TTCAAAGGGAATTGCCTAGTCAGATTATCGAGTCAATGGGACGTCGTGGAAAATATTTGCTTTTTTATCT 

CAAGGTCTTGATTTrCCATTTGCGGAT^ 

GCCCATGr 1 UCi 1 iCATTTTGAAGATGGTGGCACGCTTGTTTATGAGGATGTTCGCAAGTTTGGAACCATGGAAC
TCrrGGTGCCTGACCrmAGACGTCTACrrTATTTCTAAAAAArrAGGTCCT^

TTTACAGGTCTTTCAATCTGCCCTTGCCAAGTCCAAAAAGCCTATCAAATCCCATCTCCTAGACCAGACC^ 
GCTGGACrrGGCAATATCTATGTGGATGAGGrrCTCTGGCGAGCTCAGGTTCATCCAGCTAGACCTTCCC^ 
TGACAGCAGAAGAAGCGACTGCCATTCATGACCAGACCATTGCTGTTTTGGGCCAGGCTGTTGAAAAAGGTGGCT 
CCACCATTCGGACrrATACCAATGCCTTTGGGGAAGATGGAAGCATGCAGGACTTTCATCAGGTCTATGATAAGA
CTGGTCAAGAATGTGTACGCTGTGGTACCATCATTGAGAAAATTCAACTAGGCGGACGTGGAACCCAC'lTl'lGTCC 
AAACTGTCAA AGG AGGG ACTG A 

MLLVFTEGGLMPELPEVETVOlGLEmiGKiaSSrEIRYPKMIKTDLEEFOIUELPSQnESMG^ 
RMEGKYFYYPDQGPERKHAHVFFHFEDGGTLVYEDVWCFGTMELLVPDLLDVYFISKKLGPEPSEQDFDLQVFQSALA
KSKKPIKSHLLDQTLVAGLGNIYVDEVLWRAQVHPARPSQTLTAEEATAIHDQTIAVLGQAVEKGGSTIRT^ 
GSMQDFHQVYDKTGQECVRCGTnEKIQLGGRGTHFCPNCQRRDZ 

ID34 633bp


ttgtccaaactgtcaaaggagggactgatgggaaaaatcatcggaatcactgggggaattgcctctggtaagtca 
actgtgacaaattttctaagacagcaaggctttcaagtagtggatgccgacgcagtcgtccaccaactacagaaa 

CCTGGTGGTCGTCTGTTTGAGG CTCTA GTACAGCACTTTGGGCAAGAAATCATTCTTGAAAACGGAGAACTCA^ 
GCCCTCTCCTAGCTAGT CTCAT CTTTTCAAATCCTGATGAACGAQAATGGTCTAAGCAAATTCAAGGGGAGATTAT 
CCGTGAGGAACrGGCTACTTTGAGAGAACAGTTGGCTCAGACAGAAGAGATTTTCTTCATGGATATTCCCCT

TTTGAGCAGGACTACAGCGATTGGTTTGCTGAGACTTGGTTGGTCTATGTGGACCGAGATGCCCAAGTGGAACGC 
TTAATGAAAAGGGACCAGTTGTCCAAAGATGAAGCTGAGTCrCGTCrrGGCAGCCCAGTGGCCTTTAGAAAAAAAG 
AAAGATTTGGCCAGCCAGGTTCTTGATAATAATGGCAATCAGAACCAGCTTCTTAATCAAGTGCATATCCTTOT 
AGGGAGGTAGGCAAGATGACAGAGATTAA 


MSKI^EGLMGKIIGITGGlASGKSTVTNFLRQQGFQVVDADAVVHQLQKPGGRLFEALVQHFGQEia.£NGELNRPLL 

ASLIFSNPDEREWSKQIQGEUREEUATLREQLAQTEEIFFMDIPLLFEQDYSDWFAETWLVYVDRDAQVERLMKRDQLS 

KDEAESRLAAQWPLEKKKDLASQVLDNNGNQNQLLNQVHILLEGGRQDDRDZ

ID35 1269bp

TTGATAATAATGGCAATCAGAACCAGCTTCTTAATCAAGTGCATATCCnTCn'GAGGCAGGTAGGCAAGATGACA 

GAGATTAACTGGAAGGATAATCTGCGCATTGCCTGGTTTGGTAATTrrCTG 

TACCrrTTATGCCCATCTTCGTGGAAAATCTAGGTGTAGGGACTCAGCAAGTCGCTTT^ 

TTCTGTCTCTGCTATTTCCGCGGCGCTCITrrCTCCTATTTGGGGTATTCTTG

TGAT GATTCGGGCAGGTCrrGCTATGACTATCACTATGGGAGGCrrGGCCTTrGTCCCAAATA^ 
Cri'rCinCGTTTACTAAACGGTGTATTTG CAGG rri'i'OrrCCrAATGCAACGGCACn'GATAGCCAGTCAGGT^ 
AAGGAGAAATCAGGCTCTGCCTTAGGTACTTTGTCTACAGGCGTAGTTGCAGGTACTCTAACTGGTCCCTT^ 
GTGGCnTATCGCAGAATTATTTGGCArrCGTACA OU i ' itl i ACTGGTTGGTAG rn TCTAT 1 1 11 AGCTGCTATTT 

TGACTATTTGCTTTATCAAGGAAGATTTTCAACCAGTAGCCAAGGAAAAGGCTATTCCAACAAAGGAATT
CTCGGTTAAATATCCCTATCTmGCTCAATCTCTTTTTAACCAGTTTTGTCATCCAA 
GCCCTATTTTGGCTCTTTATGTACGCGACTTAGGGCAGACAGAGAATCTrCTTTTTGTCTCT 

AGTATGGGCTTTTCCAGCATGATGAGTGCAGGAGTCATGGGCAAGCTAGGTGACAAGGTGGGCAATCATCGTCTC 
TTGGTTGTCGCCCAGTTTrATTCAGTC^^ 
CTATCGTTTCCTCTTTGGATr GGGA ACCGGTGCCTTGATTCCCGGGGTTAATGCCCTACTCAGCAAAATGACT
AAAGCCGGCATTTCGAGGGTCTTTGCCTTCAATCAGGTATTCrrrrATCTGGGAGGTGTTGT^ 
GTTCTGCAGTAGCAGGTCAATTTGGCTACCATGCTGTCTrrTATGCGACAAGCCTITGTGTTGCCTl^ 
TTTAACCTGATTCAATTTCGAACATTATTAAAAGTAAAGGAAATCTAG 

MIIMAIRTSFUKCISFUIEVGKMTEINWKDNLRIAWFGNFLTGASISLVVPFMPIFVENLGVGSQQVAFYAGLAISVSAIS
AALFSPIWGILADKYCRKPMMIRAGLAMTrrMGGlJiiFVPNIYWUFLRtLNGVFAGFVPNATAUASQVPKEKSGSALG 
TLSTGVVAGTLTGPFIGGFIAELFGJRTVFlXVGSFLFLAAILTICFIKEDFQPVAKEKAIPTKELFTSVKYPYLU-NU^m 
FVlQFSAQS!GPllj\LYVRDLGQTENLLFV'SGLIVSSMGFSSMMSAGVMGiaGDKVGNHRI-LVVAQFYSVnYLLCANAS 
SPLQLGLYRFLFGLGTGAUPGVNALLSKMTPKAGISRVFAFNQVFFYLGGVVGPMAGSAVAGQFGYHAVFYATSLCV 

AFSCLFNLIQFRTLLKVKEIZ
ID36 1311bp

ATGGCCCTACCAACTATTGCCATTGTAGGACGTCCCAATGTTGGGAAATCAACCCTATrrAATCGGATCGCTGGTG 
AGCGAATCTCCATTGTAGAAGATGTCGAAGGAGTGACACGTGACCGTATTTATGCAACGGGTGAGTGGCTCAATC 

GTrCTTTTAGCATGATTGATACAGGAGGAATTGATGATGTCGATGCTCCTTTCATGGAACAAATCAAGCACCAGGC
AGAAATTGCCATGGAAGAAGCAGATGTTATCCTTTTTGTCGTGTCTGGTAAGGAAGGAATTACTGATGCAGACGA 
ATACGTAGCTCGTAAGCmATAAGACCCACAAACCAGTTATCCTCGCAGTCAACAAGGTGGACAACCCTGAGAT 
GAGAAATGATATATATGATTTCTATGCTCrCGGTTTGGGTGAACCATTGCCTATCTCATCTGTCCATGGAATCGGT 
ACAGGGGATGTGCTAGATGCGATCGTAGAAAATCTTCCAAATGAATATGAGGAAGAAAATCCAGATGTCATTAAG 

TTTAGCTTGATTGGTCGTCCTAACGTTGGAAAATCAAGCTTGATCAATGCTATCTTGGGAGAAGACCGTGTTATT^
CTAGTCCTGTTGCTGGAACAACTCGTGATGCCATTGATACCCACTTTACAGATACAGATGGTCAAGAGTTTACCAT
GATrGATACGGCTGGTATGCGTAAGTCTGGTAAGGTTTATGAAAATACTGAG*J^.ATACTCTGTTATGCGTGCCATG 
CGTGCTATTGACCGTTCAGATGTGGTCTTGATGGTCATCAATGCGGAAGAAGGCATTCGTGAGTACGACAAGCGT 
ATCGCAGGATTTGCCCATGAAGCTGGTAAAGGGATGATTATCGTGGTCAACAAGTGGGATACGCTTGAAAA AGAT 
AACCACACTATGAAAAACTGGGAAGAAGATATCCGTGAGCAGTTCCAATACCTGCCTTACGCACCGATTATCTrr
CTATCAGCTTTAACCAAGCAACGTCTCCACAAACTTCCTGAGATGATTAAGCAAATCAGCGAAAGTCAAAATACA 
CGTATTCCATCAGCTGTCTTGAACGATGTCATCATGGATGCCATTGCCATCAACCCAAC ACCG ACAGACAAAGGA 
AAACGTCTCAAGATTTTCTATGCGACCCAAGTGGCAACCAAACCACCAACCITTGT CATC TTTGTCAATG^ 
AACTCATGCA Cl H ICi ACCTGCGTTTCTTGGAAAATCAAATCCGCAAGGCCTTTGTTTTTGAGGGAACACCGAT 
TCATCTCATCGCAAGAAAACGCA AATAA

MALPTIAIVGRPNVGKSTLFNRIAGERISIVEDVEGVTRDRIYATGEWLNRSFSMIDTGGIDDVDAPFMEQIKHQAEIAM 
EEADVTVFVVSGKEGITDADEYVARKLYKTHKPVILAVNKVDNPEMRNDIYDFYALGLGEPLPISSVHGIGTGDVLDAI 
VENLPNEYEEENPDVIKFSLIGRPNVGKSSLINAILGEDRVIASPVAGTTRDAIDTHFTDTDGQEFTMIDTAGMRKSGKV 
YEhTTEKYSVMRAMRAlDRSDVVLMVINAEEGIREYDKRlAGFAHEAGKGMIIVVNKWDTLEKDNHTMKNWEEDIR^^
FQYLPYAP^FVSALTKQRLHKLPEMIKQIS£SQ^^^UPSAVLNDVIMDAlAmPTPTDKGKRLKIF/ATQVATKPF^FVff^ 
NEEELMHFSYLRFLENQIRKAFVFEGTPIHLIARKRKZ 

ID37 714bp


ATGACAGAAACCATTAAATTGATGAAGGCTCATACTTCAGTGCGCAGGTTTAAAGAGCAAGAAATTCCCCAAGTA 
GACTTAAATGAGATTTTGACAGCAGCCCAGATGGCATCATCTTGGAAGAATTTCCAATCCTACTCr^ 
TACGAAGTCAAGAGAAGAAAGATGCCTTGTATGAATrGGTACCTCAAGAAGCCATTCGCCAGTCTGCTGTTTTCCT 
TCTCTrrGTCGGAGATTTGAACCGAGCAGAAAAGGGAGCCCGACTTCATACCGACACCTTCCAACCCCAAGGTGT 

GGAAGGTCTCrTGATTAGTTCGGTCGATGCAGCTCTTGCTGGACAAAACGCCTTGTTGGCAGCT^

TATGGTGGTGTGATTATCGGTTTGGrrCGATACAAGTCTGAAGAAGTGGCAGAGCTCTTTAACCTACCTGACTACA 
CCTATTCTGTCTTTGGGATGGCACTGGGTGTGCCAAATCAACATCATGATATGAAACCGAGACTGCCACTAGAGA 
ATGTTGTCTTTGAGGAAGAATACCAAGAACAGTCAACTGAGGCAATCCAAGCTTATGACCGTGTTCAGGCTGACT 
ATGCTGGGGCGCGTGCGACCACAAGCTGGAGTCAGCGCCTAGCAGAACAGTTTGGTCAAGCTGAACCAAGCTCAA 

CTAGAAAAAATCTTGAACAGAAGAAATTATTGTAG

MTETIKLMKAHTSVRRFKEQEIPQVDLNEILTAAQMASSWKNFQSYSVTVVRSQEKKDALYELVPQEAIRQSAVFLLFV 
GDLNRAEKGARLHTDTFQPQGVEGLLISSVDAALAGQNALLAAESLGYGGVIIGLVRYKSEEVAELFNLPDYTYSVFG 
MALGVPNQHHDMKPRLPLENVVFEEEYQEQSTEAIQAYDRVQADYAGARATTSWSQRUAEQFGQAEPSSTRKNLEQK 
KLLZ

ID38 729bp 

ATGACAGAAATTAGACTAGAGCACGTCAGTTATGCCTATGGTCAGGAGAGGATTTTAGAGGATATCAACCTACAG 
GTGACTTCAGGCGAAGTGGTTTCCATCCTAGGCCCAAGTGGTGTTGGAAAGACCACCCTCTTTAATCTAATCGCT^
GGATTTTAGAAGTTCAGTCAGGGAGAATTGTCCTTGATGGTGAAGAAAATCCCAAGGGGCGCGTGAGTTATATGT 
TGCAAAAGGATCTGCTCTTGGAGCACAAGACGGTGCTTGGAAATATCATTCTGCCCCTCTTGATrCAAAAGGTGG 
ATAAGGCAGAAGCTATTTCCCGAGCGGATAAAATTCTTGCGACarrCCAGCTGACAGCTGTAAGAGACA AGTAT C 
CTCATGAACTTAGCGGTGGGATGCGCCAGCGTGTAGCCTTACTCCGGACCTACCTTTTTGGGCACA^ 
crrAGATGAGGCCTTTAGCGCCrrGGATGAGATGACAAAGATGGAACTCCACGCTTGGTATCnTGAGAT^^

GCAGTTGCAGCTAACAACCCTGATCATCACGCATAGTATTGAGGAGGCCCTCAATCTCAGCGACCGTATCTATATC 
TTGAAAAATCGCCGTGGGCAGA TTGT TTCAGAAATTAAACTAGATTGGTCTGAAGATGAGGACAAGGAAGTCCAA 
AAGATTGCCTACAAACGTCAAATTrrGGCGGAATTAGGCTTAGATAAGTAG 

MTEIRLEHVSYAYGQERILEDINLQVTSGEVVSILGPSGVGKTTLFNLIAGILEVQSGPJVLDGEHNPKGRVSYMLQKDLL
LEHKTVLGNIILPLLIQKVDKAEAISRADKILATFQLTAVRDKYPHELSGGMRQRVALLRTYLFGHKLFLLDEAFSALDE 
MTKMELHAWYLEiHKQLQLTTLIITHSIEEALNLSDRIYILKNRPGQIVSEIKLDWSEDEDKEVQKIAYKRQILAELGLDK 
Z

ID39 2433bp

ATGAACTATTCAAAAGCATTGAATGAATGTATCGAAAGTGCCTACATGGTTGCTGGACATTTTGGAGCTCGTTATC 
TAGAGTCGTGGCACTTGTTGATTGCCATGTCTAATCACAGTTATAGTGTAGCAGGGGCAACTTTAAATGA TTAT CC 
GTATGAGATGGACCGTTTAGAAGAGGTGGCTTTGGAACTGACTGAAACGGACTATAGCCAGGATGAAACCTTTAC 

GGAATTGCCGTTCTCCCGTCGTTTGCAGGTTCTTTTTGATGAAGCAGAGTATGTAGCGTCAGTGGTCCATGCTAAG
GTACTAGGGACAGAGCACGTCCTCTATGCGATTTTGCATGATAGCAATGCCTTGGCGACTCGTATCTTGGAGAGG 
GCTG Gl - rrriLll ATGAAGACAAGAAAGATCAGGTCAAGATTGCTGCTCTTCGTCGAAATTTAGAAGAACGGGCA 
GGCTGGACTCGTGAAGATCTCAAGGCTTTACGCCAACGCCATCGTACAGTAGCTGACAAGCAAAATTCTATGGCC 
AATATGATCGGCATGCCGCAGACTCCTAGTGGTGGTCTCGAGGATTATACGCATGATTTGACAGAGCAAGCGCGT 

TCTGGCAAGTTAGAACCAGTCATCGGTCGGGACAAGGAAATCTCACGTATGATTCAAATCTTGAGCCGGAAGACT



AAGAACAACCCTGTCTTGGTTGGGGATGCTGGTGTCGGGAAAACAGCTCTGGCGCTTGGTCrrGCCCAGCCn'ATTG
CTAGTGGTGACGTGCCTGCGGAAATGGCTAAGATGCGCGTGTTAGAACTTGATTTGATGAATGTCGTTGCAGGGA 
CACGCTTCCGTGGTGACTTTGAAGAACGCATGAATAATATCATCAAGGATATTGAAGAAGATGGCCAAGTCATCC 
TCriTATCGATGAACrrCCACACCATCATGGGTTCTGGTAGCGGGATTGATTCGACTCTGGATGCGGCCAATATOT 
GAAACCAGCCTTGGCGCGTGGAACrrrGAGAACGGTTGGTGCCACTACTCAGGAAGAATATCAAAAACATATCGA
AAAAGATGCGGCACrrTCTCGTCGTTTCGCTAAAGTGACGATTGAAGAACCAAGTGTGGCAGATAGTATGACTAT 
TTTACAAGGTTTGAAGGCGACTTATGAGAAACATCACCGTGTACAAATCACAGATGAAGCGGTTGAAACAGCGGT 
TAAGATGGCTCATCGTTATTTAACCAGTCGTCACTTGCCAGACTCTGCTATCGATCTCTTGGATGAGGCGGCAGCA 
ACAGTGCAAAATAAGGCAAAGCATGTAAAAGCAGACGATTCAGATTTGAGTCCAGCTGACAAGGCCCTGATGGAT 

GGCAAGTGGAAACAGGCAGCCCAGCTAATCGCAAAAGAAGAGGAAGTACCTGTCTACAAAGACTTGGTGACAGA
GTCTGATATTTTGACCACCrrrGAGTCGCTTGTCAGGAATCCCAGTTCAAAAACTGACTCAAAC^ 
TATTTAAATCTTGAAGCAGAACTCCATAAACGGGTTATCGGTCAAGATCAAGCTGTTTCAAGCATTAGCCG^^ 
TTCGCCGCAACCAGTCAGGGATTCGCAGTCATAAGCGTCCGATTGGTTCCTTTATGrrCCTAGGGCCTACAGGTGT 
CGGGAAAACTGAATTAGCCAAGGCTCTGGCAGAAGrr cri - i - ri 'GACGACGAATCAGCCCITATCCGCTTTGATATG 

AGTGAGTATATGGAGAAATTTGCAGCTAGTCCTCrCAACGGAGCTCCTCCAGGCTATGTAGGATATGAAGAAGGT
GGGGAGTTGACAGAGAAGGTTCGCAATAAACCCTATTCCGTTCTCCTCTTTGATGAGGTAGAGAAGGCCCACCCA 
GATATCTTTAATGTTCTCTTGCAGGTTCTGGATGACGGTGTCTTGACAGATAGCAAGGGACGCAAGGTCG AI i 1 IT 
CAAATACCATTA TCAT TATGACATCGAATCTAGGTGCGACrGCCCTTCGTGATGATAAGACTGTTG G i i ' l ' i GGGGC 
TAAGGATATTCGTTTTGACCAGGAAAATATGGAAAAACGCATGTTTGAAGAACTGAAAAAAGCTTATAGACCG^ 

ATTCATCAACCGTATTGATGAGAAGGTGGTCTTCCATAGCCTATCTAGTGATCATATGCAGGAAGTGGTGAAGATT
ATGGTCAAGCCTTTAGTGGCAAGTTTGACTGAAAAAGGCATTGACrrGAAATTACAAGCTrCAGCT 
TAGCAAATCAAGGATATGACCCAGAGATGGGAGCTCGCCCACTTCGCAGAACCCTGCAAACAGAAGTGGAGGAC 
AAGTTGGCAGAACTTCTTCTCAAGGGAGATTTAGTGGCAGGCAGCACACrTAAGATTGGTGTCAAAGCAGGCCAG 
TTAAAAnTGATATTGCATAA 


MNYSKALNECIESAyMVAGHFGARYLESWHLLlAMSNHSYSVAGATLNDYPYEMDRLEEVALELTETDYSQDETFTE 
LPFSm-QVLFDEAEYVASWHAKVLGTEHVLYAILHDSNALATRILERAGFSYEDKKDQVKlAALRI^ 
EDLKALRQRHRTVADKQNSMANMMGMPQTPSGGL£DYTHDLT£QARSGKl,EPVIGlmKEISRMIQIl^RKTKN^^^^ 
GOAGVGKTAI^LGiJVQRIASGDVPAEMAKMRVLELDLMNVVAGTi^RGDFEERMNNnmEEDGQVILFmELHTIM 

GSGSGIDSTLDAANILKPALARGTLRTVGATTQEEYQKHIEKDAAI^RRFAKVTIEEPSVADSMTn.QGLKATYEKHHRV
QrrDEAVETAVKMAHRYLTSRHLPDSAIDLlXtEAAATVQNKAJCHVKADDSDI^PADKAUvIDGKWKQ^^ 
PVYKDLVTESDILTTLSRLSGIPVQKLTQTDAKKYLNLEAELHKRVIGQDQAVSSISRAIRRNQSGiRSHKRPlGSFMFLCP 
TGVGKTELAKALAEVLFDDESALHIFDMSEYMEKFAASRLNGAPPGYVGYEEGGELTEKVRNKPYSVLLFDEVEKAHP 
DIFNVLLQVLDDGVLTDSKGRKVDFSNTinMTSNLGATALRDDKTVGFGAKDlRFDQENMEKRMFEELKFCAYRPEFIN 

RJDEKVVFHSLSSDHMQEVVKIMVKPLVASLTEKGIDLKLQASALKLLANQGYDPEMGARPLRRTLQTEVEDKLAELL
UCGDLVAGSTLKIGVKAGQLKFDUZ 

ID40 1005bp

ATGAAG AA A ACATGGAAAGTG i i 1 ' 1 1 AACGCTTGT AAC AGCTCTTCTAG CTGTTGTGCTTGTGGCCTGTGGTCAAG

GAACTGCrrCTAAAGACAACAAAGAGGCAGAACTTAAGAAGGTTGACTTTATCCTAGACTGGACACCAAATACCA 
ACCACACAGGGCTTTATCTTGCCAAGGAAAAAGGTTATTTCAAAGAAGCTGGAGTGGATGTTGATTTGAAArrGC 
CACCAGAAGAAAGTTCTTCTGACTTGGTTATCAACGGAAAGGCACCATTrGCAGTGTATTTCCAAGACTACATGGC 
TAAGAAATTGGAAAAAGGAGCAGGAATCACTGCCGTTGCAGCTATTGTTGAACACAATACATCAGGAATCATCTC 

TCGTAAATCTGATAATGTAAGCAGTCCAAAAGACTTGGTTGGTAAGAAATATGGGACATGGAATGACCCAACTGA
ACrrGCTATGTTGAAAACCTTGGTAGAATCTCAAGGTGGAGACTTTGAGAAGGTTGAAAAAGTACCAAATAACG 
CrCAAACTCAATCACACCGATTGCCAATGGCGTCTTTGATACTGCTTGGATTTACTACGGTrGGGATGG^^^ 
GCTAAATCTCAAGGTGTAGATGCTAACTTCATGTACTTGAAAGACTATGTCAAGGAGTTTGACTACTATTCACCAG 
TTATCATCGCAAACAACGACTATCTGAAAGATAACAAAGAAGAAGCTCGCAAAGTCATCCAAGCCATCAAAAAA 

GGCTACCAATATGCCATGGAACATCCAGAAGAAGCTGCAGATATTCTCATCAAGAATGCACCTGAACTCAAGGAA
AAACGTGACriTGTCATCGAAT CTCAA AAATACTTGTCAAAAGAATACGCAAGCGACAAGGAAAAATGGGGTCAA 
TTTGACGCAGCTCGCTGGAATGCTTTCTACAAATGGGATAAAGAAAATGGTATCCTTAAAGAAGACTTGACAGAC 
AAAGCCTTCACCAACGAATrTGTGAAATAA 

MKKTWKVFLTLVTALVAVVLVACGQGTASKDNKEAELKKVDFILDVn-PhrrNHTGLYVAKEKGYFKEACVDVDLKLP
PEESSSDLV^NGKAPFAVYFQDYMAKKL£KGAGr^AVAAIVEH^rrSGiISRKSDNVSSPKDLVGKKYGTW^^DPTELAML 
KTLVESQGGDFEKVEKVPNNDSNSITPIANGVFDTAWIYYGWDGILAKSQGVDANFMYLKDYVKEFDYYSPVIIANND 
YLKDNKEEARKVIQAIKKGYQYAMEHPEEAADILIKNAPELKEKRDFVIESQKYLSKEYASDKEKWGQFDAARWNAFY 
KWDKENGILKEDLTDKGFTNEFVKZ 


ID41 762bp

TTGATGAGAA ACTT GAGAAGTATACTGAGACGACACATTAGTCTATTGGGCTTrCTCGGAGTATTGTCAATCTGGC 
AGTTAGCAGGTTTTCTTAAACTTCTCCCCAACTTrATCCTGCCGACACCTCTTGAAATrCTCCAGCCC^ 
GACAGAGAATTTCTCTGGCACCATAGCTGGGCGACCTTGAGAGTGGCTTTACTGGGGCTGATTrrGG
TTGCCTGTCTTATGGCTGTGCTCATGGATAGTTTGACTTGGCTCAATGACCTGATTTACCCT^
CAGACCATTCCGACCATTGCCATAGCTCCn-ATCCTGGTCTTGTGGCTAGGTTATGGGAl^ 
TGATTATCrrAACGACAACCmCCCATCATCGTTAGTATTTTGGACGGTmAGGCArrG^^ 
GACCTTGTTTAGTCTGATGCGGGCCAAGCCnTGGCAAATCCTGTGGCATTrrAAAATCCCAGT^ 
TTTTATGCAGGTCTGAGGGTCAGTGTCTCCTACGCCTTTATCACAACTGTGGTATCT^

AAGGTCTTGGTGTTTATATGATTCAGTCTAAAAAACTGTTTCAGTATGATACCATGTTTGCCAT^^ 
TCGATrATCAGTCTTTTGGGTATGAAGCrGGTCGATATCAGTGAAAAATATGTGATTAAATGGAAACGTTCGTAG 

MMRNLRSILRRHISLLGFLGVUIWQLAGFLKLLPKFILPTPLEILQPFVRDREFLWHHSWATLRVAL^^ 
AVLMDSLTWLNDLIYPMMVVIQTIPTIAIAPILVLWLGYGILPKIVLIILTTTFPIIVSILDGFRHCDKDMLTL
WQILWHFKIPVSLPYFi^ACU^VSVSYAFriTWSEWLGGFEGLGVYMIQSKKLFQYDTMFAIIILVSIISL^^ 

KYVIKWKRSZ 






ID42 372bp 



TTGATTTTTAATCCTATTTGCrGTATGATAAGGGAAAAGAAAGGGGACAGAGAT^Vr^^ 
TGCGATCTGCTAGTTTTGGTATTGTTACCAGCTTGCCTGATGACATCATTGACTCT^^ 
TTCTTAAAAAATGTCTTTGAATTGGAAGAAGAACTCGAGTTTCAATTGCTT/u^^ 
ACrrrrCAAGTCAACACCTCCCTACAGCCArrGATTTTGACTTTAACCATCCTTTC^ 
GTACTGGTTTTAGACATGGACGGTAGAGAAACTATCCTCCrCCCAGAAGAAAATGACCTATTTTAA

MlFNPlCCMIRBKKGDRDMAFTOTHMRSASFGrVTSLPDDIIDSFWYIIDHFLKWFELEEELEFQLLNNQGKIT^^ 
HLPTAIDFDFNHPFDPRYPPRVLVLDMDGRETILLPEENDLFZ 

ID43 1569bp

ACAGCGGTGTCATTCTATCTATTTTAAGAAAAGTAATAATCAArrGTTAAAAATAGTAAAAAAATO 
ATGAAATATTTTGTTCCTAATGAGGTATTCAGTATrCGTAAATTAAAGGTGGGGACTTGCTCGGTACTATTG 
TTTCAATTTTGGGAAGCCAAGGTATTTTATCGGATGAAGTTGTTACTAGTTCTTCAC^^ 
TrCTAATGCAATTACTAATGArrrAGATAATTCACCAACTGTTAATCAGAATCGTTCTGCTGAAATGAT^^
AATTCAACCACTAATGGTTTAGATAATrCGTrAAGTGTrAATAGCATCAGCTCTAATGGTACTATTCGT^ 
CACAATTAGACAACAGAACAGTTGAATCTACAGTAACATCTACTAATGAAAATAAGAGTTATAAGGAAGATGTTA 
TAAGTGACAGAATTATCAAAAAAGAATTTGAAGATACTGCTTTAAGTGTAAAAGATTATGGTGCAGTAGGTGATG 
GGATTCATGATGATCGACAAGCAATTCAAGATGCAATAGATGCTGCAGCTCAAGGGCTAGGTGGAGGAAATGTAT 

ATTTTCCTG A AGG AACTTATITAGTA A AAG AAATTGTTTTTTr AA AA^

AGCTACAATTCTAAATGGTATAAATATTAAGAATCACCCITCCATTGTrTTrATGACAGGTTTA^ 
GGTGCGCAAGTAGAATGGGGCCCAACAGAAGATATTAGTTATTCTGGTGGTACGATTGATATGAACGGTGCTTTG 
AATGAAGAAGGAACTAAAGCAAAAAATCTACCACTTATAAATTCTTCAGGTGCATTTGCTATTGGGAATO 
AACGTAACTATAAAAAATGTAACATTCAAGGATAGTTATCAAGGGCATGCTATTCAAATTGCAGGTTCGAAAAAT 

GTATTAGrrGATAATTCTCGTTTTCTTGGGCAAGCCTTACCCAAAACGATGAAGGATGGGCAAATCATAAG^
AGAGCATTCAGATTGAACCATTAACTAGAAAAGGTTTTCCTTATGCCrTGAATGATGATGGGAAAAAATCT^ 
ATGTGACTArrCAAAATTCCTATTTTGGCAAAAGTGATAAATCTGGGGAAT TAGT AACAGCAATTGGCACACACT 
TCAAACATTGTCGACACAGAACCCCTCTAATATTAAAATTCAAAATAATCATTTTGATAACATGATGTATGCAGGT 
GTACGTTTTACAGGATTCACTGATGTATTAATCAAAGGAAATCGCTTTGATAAGAAAGTTAAAGGAGAGAGTGTA 

CATTATCGAGAAAGCGGAGCAGCTTTAGTAAATGCTTATAGCrATAAAAACACTAAAGACCTATTAGATr^

AAACAGGTGGrrATCGCCGAAAATATATTTAATATTGCCGATCCTAAAACAAAAGCGATACGAGTTGCAAAAGAT 
AGTGCAGAATGTTTAGGAAAAGTATCAGATATTACTGTAACAAAAAATGTAATTAATAATAATTCTAAGGAAACA 
GAACAACCAAATATTGAATTATTACGAGTTAGTGATAATTTAGTAGTCTCAGAGAATAGT 

QRCHSIYFKKSNNQLLK1VKKLEVLMKYFVPNEVFSIRKLKVGTCSVLLAISILGS<K3I1^DEVVTSSSPMATKESSNA^^
DLDNSPTVNQNRSAEMIASNSTTNGLDNSLSVNSISSNGTIRSNSQLDNRTVESTVTSTNENKSYKEDVISDRIIKKEFEDT 
AL^VKDYGAVGCKjIHDDRQAIQDAIDAAAQGLGGGNVYFPEGTYLVKEIVFLKSHTHLELNEKATU^GINIKNHPSIVF 
MTGLFTDDGAQVEWGPTEDISYSGGTIDMNGALNEEGTKAK^LPUNSSGAFAIGNSNNVTIKNVTFKDSYQGHAIQIA 

gsknvlvdnsrflgqalpktmkdgqiiskesiqiepltrkgfpyalnddgkksenvtiqnsyfgksdksgelvtaigthy 
qtl5tqnpsnikiqnnhfdnmmyagvrftgftdvlikgnrfdkkvkgesvhyresgaalvnaysykntkdlldlnkq
vviaenifniadpktkairvakdsaeclgkvsditvtknvinnnsketeqpniellrvsdnlvvsens 

ID44 324bp 

GTGATGAAAGAAACTCAGCTATTAAAAGGTGTTCTTGAAGGTTGTGTCTTGGATATGATTGGTCAAAAAG^
TATGGTTATGAGTTGGTrCAGACTTTGCGAGAGGCTGGATTTGATACTATCGTTCCAGGAACTATTTAT^ 
GCAAAAGTTAGAAAAAAATCAATGGATAAGAGGCGACATGCGCCCGTCGCCAGATGGTCCAGATCGGAAGTATTT 
TTCATTAATGAAAGAAGGAGAAGAGCGTGTCTCAGTCTTTTGGCAACAATGGGACGATTTGAGTCAAAAAGTAGA 
AGGGATTAAGAATGGGGGTTAA 

MMKirTOLLKGVLEGCVLDMIGOKERyGYELVQTLREAGFDTIVPGTIYPLLQlCLEKNQWIRGDMRPSPDGPDRKYFSL
MKEGEERVSVFWQQWDDLSQKVEGIKNGGZ

ID45 816bp

ATGAAGAAAATGAAGTATTACGAAGAAACAAGCGCTTTCCTACATGAGTTTTCTGAGGAGAATCAAAA 
GAGGAGTTGTGGGAAAGTTrrAATCTTGCTGGATTTCTCTATGATGAAGACTATCTCAGAGAGCAGATCTAT^ 
TGATGCTAGATTTCrCAGAAGCAGAACGAGATGGCATGAGTGCAGAGGATTATCTAGGTAAGAATCCTAAAAAAA 
TAATGAAAGAGATTCTCAAGGGAGCACCrCGCAGTTCTATCAAAGAGTCCCTTTTGACGCCAATTCT^ 
GGTATTACGTTA TTATC AACTACTAAGTGATTTITCTAAAGGTCCTCrCTTAACAGTCAAT^
GGCAACTTCTTATTrrrCTGATTGGATrrGGACTTGTGGCCACAAriTrACGAAG^ 
AAAATGAAAATTGGCACTTACATTGTTGTTGGGACTATAGTTCTrCrAGTTGTnTAGGATATCT 
GCTTCATACAAGAAGGAGCCrmATATTC^ 

GGTATTTGGAATTGGAAAGAACCGGTCrrTCGTCCATTTGTCAGTATGATTATrGCCCATCTTGTGGTGGGT^ 
GCTCCGTTATTATGAGTGGATGGGAATTTCAAATGTTTTCOT

GAATCTTTGTCTTGTTCCGTGGGTTTAAGAAGATAAAATGGAGTGAAGTATAG 

MKKMKYYEETSALLHEFSEENQKYFEELWESFNLAGFLYDEDYLREQIYLMMLDFSHAERDGMSAEDYLGKNPKKIM 
KEILKGAPRSSIKESLLTPILVLAVLRYYQLLSDFSKGPLLTVmLTFLGQLLIFLIGFGLVATILRRSLVQDSPKMKIGTYI 
VVGTIVLLVVLGYVGMASFIOEGAFYIPAPWDSLSVFTISLVIGIWNWKEAVFRPFVSMIIAHLVVGSLLRYYEWMGISN
VFLTKVIPLAVLHGIFVLFRGFKKIKWSEVZ 

ID46 348bp

CTGTTTTTrTATTTATAa*CAATGAAAATCAAAGAGCAAACTAGGAAGCn-AGCCGCAGGTTGCT^
,TTGAGGrrGTAGACGAAACTGACGAAGTCAGCTCAAAACATGTTTTTGAG 
GCrCAAAACACTGTTrrGAGGTTGTAGATGAAACTGACGAAGTC^^ 

AAACTGACCAAGTCAGCTCAAAACATGTTrn-GAGGTTGTAGATGAAACTGACGAAGTCAGTAACCATACAT^^ 
GTAGGGCGACGCTGACGTGGTTTGAAGAGATnTCGAAGAGTATTAA 


MFFYLYSMKIKEQTRKLAAGCSKHCFEVVDETDEVSSKHVFEVVDHTDEVSSKHCFEVVDETDEVSSKHCFEVVDETD 
EVSSKHVFEVVDETDEVSNHTYGRATLTWFEEIFEEYZ

ID47 1260bp 


ATGCAGAATCTGAAATTTGCCTTrrCATCTATCATGGCTCACAAGATGCGTTCTTTGCT^ 

TATCGGTGTTTCATCAGTrGTT GTGATTA TGGCirrGGGTGATTCCCTATCTCGTCAAGTCAATAAAGATATGACrA 

AATCT CAGAAAAATATTAGCGTCrrrn'CTCTCCTAAAAAAAGTAAAGACGGGTCTTTTACT^ 

TTTTACGGTTTCTGGAAAGGAAGAGGAAGTTCCTGTTGAACCGCCAAAACCGCAAGAATCCTGGGTCCAAGAGGC 

AGCTAAACTGAAGGGAGTGGATAGTTACTATGTAACCAATTCAACGAATGCCATCTTGACCTATCAAGATAAAAA
GGTTGAGAATGCTAATTTGACAGGTGGAAACAGAACTTACATGGACGCTGTTAAGAATGAAATTATTGCAGGTCG 
TAGTCrGAGAGAGCAAGATTTCAAAGAGTTTGCAAGTGTCATrrrGCTAGATGAGCAATTGTCCATTAGTl^ 
GAATCTCCTCAAGAGGCTATTAACAAGGTTGTAGAAGTCAATGGATTTAGTTACCGGGTCATTGGGGTTTATACTA 
GTCC GGAGGCTAAAAGATCAAAAATATATGGGTTTGGTGGCrrGCCTATTACTACCAATATCTCCCTTGCTGCGAA 

irrTAATGTAGATGAAATAGCTAATATTGTCTTTCGAGTGAATGATACCAGTTTAACCCCAACTCTGG

CTGGCACGAAAAATGACAGAGC TTGCA GGCrrACAACAGGGAGAATACCAGGTGGCAGATGAGTCCGTTGTATTT 
GCAGAAATTCAACAATCGTTTAGTTTTATGACGACGATTATrAGTTCCATCGCAGGGATrrCTCTCT^ 
GAACTGGTGTCATGAACATCATGCTGGrrrCGGTGACAGAGCGCACTCGTGAGATTGGTCTTCGTAAGGCTTTGGG 
TGCAACACGTGCCAATATTTTAATrCAGTTTrTGATTGAATCCATGATTTTGACCTTG^^ 

TGACAATTGCAAGTGGTTTAACTGCCTTAGCAGGTTTGTTACTGCAAGGTTTAATAGAAGGTATAGAAGTTGGAG^
ATCAATCCCAGTCGCCCTATTTAGTCTTGCAGTTTCGGCTAGTGTTGGTATGATTTTTGGAGTC^ 
AAGGCATCGAAACTTGATCCAATrGAAGCCCTTCGTTATGAATGA 

MQNUCFAFSSJMAHKMRSLLTMIGIIIGVSSVVVIMALGDSLSRQVNKDMTKSQKNISVFFSPKKSKDGSFTQKQSAFTVS 
GKEEEVPVEPPKPQESWVQEAAlCUCCVDSYYArrNSTNAILTYQDBCKVENANLTGGNRTYMDAVKNEIlAGRSLREQDF
KEFASVILLDEELSISLFESPQEAINKVVEVNGFSYRVtGVYTSPEAfCRSKIYGFGGLPITTNISLAANFNVDEIANIVFRVN 
DTSLTPTLGPELARKMTELAGLQQGEYQVADESVVFAEIQQSFSFMTTriSSiAGlSLFVGGTGVMNIMLVSVTERTREIG 
LRKALGATRANILIQFLIESMILTLLGGLIGLTIASGLTALAGLLLQGUEGIEVGVSIPVALFSLAVSASVGMIFGVLPANK 
ASKLDPIEALRYEZ


ID48 705bp






CTGATGAAGCAACTAATTAGTCTAAAAAATATCTTCAGAAGTTACCGTAATGGTGACCAAGAACTGCAGGTTCTC 
AAAAATATCAATCTAGAAGTGAATGAGGGTGAATTTGTAGCCATCATGGGACCATCTGGGTCTGGTAAGTCCACr 
CTGATGAATACGATTGGCATGTTGGATACACCAACCAGTGGAGAATATTATCTTGAAGGTCAAGAAGTGGCTGGG 
CTTGGTGAAAAACAACTAGCTAAGGTCCGTAACCAACAAATCGGTTTTGTCnTTCAGCAGTTC^
AGCrCAATGCTCTGCAAAATGTAGAATTGCCCrrGATTTACGCAGGAGTTTCGTCTrCAAA^ 
TGAGGAATATrrAGACAAGGTrGAATTGACAGAACGTAGTCACCATrrACCTTCAGAATTATCTGGTGCT^ 
GCAACGTGTAGCCATTGCGCGTGCCTTGGTAAAC.\ATCCTrCTATTATCCTAGCGGATGAACCGACAGGAGCOT 
GATACCAAAACAGGTAACCAAATTATGCAATTATTGGTTGATTTGAATAAAGAAGGAAAAACCATTATCATGG^^
ACGCATGAGCCTGAGATTGCTGCCTATGCCAAACGTCAGATTGTCArrCGGGATGGGGTCATTTCGTCTGACAGTG 

CrCAGTTAGGAAAGGAGGAAAACTAA 

MMKQLISLKNIFRSYRNGDOELQVLKNlNLEVNEGEFVAlMGPSGSGKSTLMNn-IGMLDTPTSGEYYLEGQEVAGLGEK 
QUAKVRNQQIGFVFOQFFLLSKLNALQNVELPLIYAGVSSSKRRKLAEEYLDKVELTERSHHLPSELSGGQKQRVAIARA
LVNNPSIlLADEPTGALDTKTGNQIMQLLVDLNKEGKTnMVTHEPElAAYAKRQIVlRDGVISSDSAQLGKEENZ 

ID49 1200bp

ATGAAGAAAAAGAATGGTAAAGCTAAAAAGTGGCAACTGTATGCAGCAATCGGTGCTGCGAGTGTAGTTGTATTG
GGTGCTGGGGGGATTTTACTCrrrAGACAACCTTCTCAGACTGCTCTAAAAGATGAGCCTACTCATC^ 
CCAAGGAAGGAAGCGTGGCCTCCTCTGTTTTATTGTCAGGGACAGTAACAGCAAAAAATGAACAATAT GTTTA TT 
TTGATGCTAGTAAGGGTGATTTAGATGAAATCCrrGTTrCTGTGGGCGATAAGGTCAGCGAAGGGCAGGCTTTAGT 
CAAGTACAGTAGTTCAGAAGCGCAGGCGGCCTATGATTCAGCTAGTCGAGCAGTAGCTAGGGCAGATCGTCATAT 

CAATGAACTCAATCAAGCACGAAATGAAGCCGCTTCAGCTCCGGCTCCACAGTTACCAGCGCCAGTAGGAGGAGA
AGATGCAACGGTGCAAAGCCCAACTCCAGTGGCTGGAAATTCTGTTGCTTCTATTGACGCTCAATTGGGTG 
CGTGATGCGCGTGCAGATGCTGCGGCGCAATTAAGCAAGGCTCAAAGTCAATTGGATGCAACAACTGTTCTCAGT 
ACCCTAGAGGGAACTGTGGTCGAAGTCAATAGCAATGTTTCTAAATCTCCAACAGGGGCGAGTCAAGTTATGGT^ 
CATATrGTCAGCAATGAAAATTTACAAGTCAAGGGAGAATTGTCTGAGTACAATCTAGCCAACCrnrCTGTAGGTC 

AAGAAGTAAGCTTTACrrCTAAAGTGTATCCTGATAAAAAATGGACTGGGAAATTAAGCrATATTT^^

TAAAAACAATGGTGAAGCAGCTAGTCCAGCAGCCGGGAATAATACAGGTTCTAAATACCCTTATACTATTGATGT 
GACAGGCGAGGTTGGTGATTTGAAACAAGGTTTTTCTGTCAACATTGAGGTTAAAAGCAAAACT^^ 
GTTCCTGTTAGCAGTCTAGTAATGGATGATAGTAAAAATTATGTCTGGATTGTGGATGAACAACAAAAGGCTAAA 
AAAGTTGAGGTITCATTGGGAAATGCTGACGCAGAAAATCAAGAAATCACTTCTGGTTTAACGAACGGTG 

GTCATCAGTAATCCAACATCTTCCrrGGAAGAAGGAAAAGAGGTGAAGGCTGATGAAGCAACTAATTAG

MKKKNGKAKKWQLYAAIGAASVWLGAGGILLFRQPSQTALKDEPTHLWAKEGSVASSVLI^GTVTAKNEQYVYFD 
ASKGDLDEILVSVGDKVSEGQALVKYSSSEAQAAYDSASRAVARADRHINELNQARNEAASAPAPQLPAPVGGEDATV 
QSPTPVAGNSVASIDAQLGDARDARADAAAQLSKAQSQLDATTVLSTLEGTVVEVNSNVSKSPTGASQVMVHIVSNEN 
LOVKGELSEYNLANLSVGQEVSFTSKVYPDKKWTGKLSYISDYPKNNGEAASPAAGNNTGSKYPYTIDVTGEVGDLXQ
GFSVNIEVKSKTKAILVPVSSLVMDDSKNYVWIVDEQQKAKKVEVSLGNADAENQErrSGLTNGAKVISNPTSSLEEGKE 

VKADEATNZ 






ID50 759bp



ATGTCACGTAAACCATTTATCGCTGGTAACTGGAAAATGAACAAAAATCCAGAAGAAGCTAAAGCATTCGTTGAA 
GCAGTTGCATCAAAACTTCCTTCATCAGATCrrGTTGAAGCAGGTATCGCTGCTCCAGCTCTTGArrTGACAAC^^ 
TTCrrGCTGTrGCAAAAGGCTCAAACCTTAAAGTTGCTGCTCAAAACTGCrACTTTGAAA 

TGGTGAAACTAGCCCACAAGTTTTGAAAGAAATCGGTACTGACTACGT TGTTA TCGGTCACTCAGAACGCCGTGA 
CTACTTCCATGAAACTGATGAAGATATCAACAAAAAAGCAAAAGCAATCTTTGCGAACGGTATGCTTCCAATCAT
CTGTTGTGGTGAATCACTTGAAACTTACGAAGCTGGTAAAGCTGCTGAATTCGTAGGTGCTCAAGTATCTGCTGCA 
TTGGCTGGATTGACTGCTGAACAAGTTGCTGCCTCAGTTATCGCTTATGAGCCAATCTGGGCTATCG^^ 
AATCAGCTTCACAAGACGATGCACAAAAAATGTGTAAAGTTGTTCGTGACGTTGTAGCTGCTGACnTGGTCAAG 
AAGTCGCAGACAAAGTTCGTGTTCAATACGGTGGTTCTGTTAAACCTGAAAATGTTGCTTCATAC ATGGC TTGCC^ 
AGACGTTGACGGTGCCCTTGTAGGTGGTGCGTCACTTGAAGCTGAAAGCrrCTTGGCTTTGCTTG
TAA 

MSRKPFIAGNWKMNKNPEEAKAFVEAVASKLPSSDLVEAGIAAPALDLTTVLAVAKGSNLKVAAQNCYFENAGAFTG 
ETSPQVLKEIGTDYWIGHSERRDYFHETDEDINKKAKAIFANGMLPIICCGESLETYEAGKAAEFVGAQVSAALAGLTA 
EQVAASVIAYEPIWAIGTGKSASQDDAQKMCKVVRDVVAADFGQEVADKVRVQYGGSVKPENVASYMACPDVDGAL
VGGASLEAESFLALLDFVKZ

ID51 1473bp

TTGAAAACAAAAATrGGATTAGCAAGTATCTGTTTACTAGGCrrGGCAACTAGTCATGTCGCTGCAAATGAAACT
AAGTAGCAAAAACTTCGCAGGATACAACGACAGCTTCAAGTAGTTCAGAGCAAAATCAGTCTTCTAATAAAACGC 
AAACGAGCGCAGAAGTACAGACTAATGCTGCTGCCCACTGGGATGGGGATTATTATGTAAAGGATGATGGTTCTA 
AAGCTCAAAGTGAATGGATTTTTGACAACTACTATAAGGCrrGGTrrTATATTAATTCAGATGGTCGT^^ 
GAATGAATGGCATGGAAATTACTACCTGAAATCAGGTGGATATATGGCCCAAAACGAGTGGATCTATGACAGTAA 

TTACAAGAGTTGGTTTTATCTCAAGTCAGATGGGGCTTATGCTCATCAAGAATGGCAATTGATTGGAAATAAGTGG
tactacttcaagaagtggggttacatggctaaaagccaatggcaaggaagttatrrcttgaatggtcaaggagct
atgatgcaaaatgaatggctcratgatccagcctarrctgcttatttttatctaaaatrc^ 
accaagagtggcaaaaagtgggcggcaaatggtactatttcaagaagtggggctatatggctcggaatgagtggc 
aaggcaactactarrtgactggaagtggtgccatggcgactgacgaagtgattatggatggtacrcgctatatctt 
tgcggcctctggtgagctcaaagaaaaaaaagatttgaatgtcggctgggttcacagagatc

ctttaataatagagaagaacaagtgggaaccgaacatgctaagaaagtcattgatattagtgagcacaatggtcg 
tatcaatgatrggaaaaaggttattgatgacaacgaagtggatggtgtcattgttcgtctaggttatagcggtaa 
agaagacaaggaattggcgcataacarraaggagttaaaccgtctgggajkttccttatggtgtctatctctatac 
ctatgctgaaaatgagaccgtgcrgagagtgacgctaaacagaccattgaacttataaagaaatacaatatgaac 
ctgtcttaccctatctattatgatgttgagaattgggaatatgtaaataagagcaagagagctccaagtgataca
ggcacrrgggttaaaatcatcj\acaagtacatggacacgatgaagcaggcgggttatcaaaatgtgtatgtctat 

AGCTATCGTAGTTTATTACAGACGCGTTTAAAACACCCAGATATm 

CGAATGCrTTAGAATGGGAAAACCCTCATTATTCAGGAAAAAAAGGTTGGCAATATACCTCrrCTGAATACAT^ 
AAGGAATCCAAGGGCGCGTAGATGTCAGCGTTTGGTATTAA 


MKTKIGLASlCLLGLATSHVAANETEVAiCTSQDTTTASSSSEQNQSSNKTQTSAEVQTNAAAHWDGDYYVKDDGSKAQ 
SEWlFDrA'YKAWFYINSTCRYSQNEWHGNYYLKSGGYMAQNEWIYDSNYKSWFYLKSDGAVAHQEWQUGNKWYY 
FKKWGYMAKSQWQGSYFLNGQGAMMQNEWLYDPAYSAYFYLKSDGTYANQEWQKVGGKWYYFKKWGYMARNE 
WQGNYYLTGSGAMATDEVIMDGTRYIFAASGELKEKKDLNVGWVHRDGKRYFFNNREEQVGTEHAKXVIDISEHNG 
RINDWK^VIDENEVDGVIViU,GYSGKEDKELAHNIKELNRLCIPYGVYLYTYAENETDAESDAKQTIEUICKYNMNLSY
PIYYDVENWEYV^^CSKRAPSDTGTWK^NKYMDTMKQAGYQ^^VYVYSYRSU-QTRLKHPDILKHVNWVAA^ 
EWENPHYSGKKGWQYTSSEYMKGIQGRVDVSVWYZ 

ID52 774bp


ATGAAAAAATTTGCCAACCTTTATCTGGGACTGG 
TGCCTTTAATGCTGGTGATCATATGAATAGCIT^ 

GGGAG ACTCATGCTGATTTTGGCTCAGACAi 1 1 i 1 C Tl GGCCTTCCTATCAGCCTTGATAGCGACCArrATCGGGA 
CrmGCTGCCATTTACATCTACCAGTCrCGTAAGAAATACCAAGAAGCCTTTCrAT^ 

GGTTGCGCCTGACGTTATGATTGGTGCTAGCTTCrrTGATTCTCrrTTACCCAACTC^

CCGTTCTATCTAGTCACGTGGCCTTCTCCATTCCTATCGTGGTCTTGATGGTCTTGCCTCGACTCAAGGAAAT^ 
TGGCGACATGATTCATGCGGCCTAT GACT TGGGAGCTAGTCAATTTCAGATGTTCAAGGAAATCATGCTTCCTTAC 
CTGACTCCGTCT ATCATT ACTGGTTATTTCATGGCCTTCACCTATTCGrrAGATGACTTTGC 
AACAGGAAATGGCrrTTCAACCCTATCAGTCGA 

GCCCTGTCTGCTCTAGTCrrrCTCTTTAGTATTATCCTACTTGTAGGTTATTACTTTATCT
GCAAGCATGA 

MKKFANLYLGLVFtVLYLPIFYLlGYAFNAGDDMNSFTGFSWTHFETMFGDGRLMLILAQTFFLAFLSAUATIIGTFGA 
lYIYQSRKKYQEAFL^LNNILMVAPDVMIGASFLILFTQLKFSLGFLTVLSSHVAFSIPrVVLMVLPRLKEMNGDMIHAAY 
DLGASQFQWFKEIMLPYLTPSIITCYFMAFTYSLDDFAVTFFVTGNGFSTLSVEIYSRARKGISLEINALSALVFLFSIILVV
GYYFISREKEEQAZ 

ID59 1071bp

ATGAAAAAAATCTATTCATTrTTAGCAGGAATTGCAGCGATTATCCTTGTCTTGTGGGGAATTGCGACTC^

atagtaaaatcaatagtcgagatagtcaaaaattggttatctataactggggagactatatcgatcctgaactctt 
gactcagtttacagaagaaacaggaattcaagttcagtacgagacttttgactccaacgaagccatgtacactaa 
gataaagcagggtggaacgacctacgatattgccattccaagtgaatacatgattaacaagatgaaggacgaag 
acctcttggttccgcrtgattattcaaaaattgaaggaatcgaaaatatcggaccagagtttctcaaccagtc^ 
tgacccaggtaataaattcrccatcccttacntcrggggaaccttaggaattgtct^

gaagcgcctgagcattgggatgacctttggaagccggagtataagaattctatcatgctctttgatggggcgcgt 
gaggtgctgggactaggactcaarrccctcggcn-acagccrcaactccaaggatctgcagcagtrggaagagaca 
gtggataagctctacaaactgactccaaatatcaaggctatcgttgcggacgagatgaagggctatatgattcag 

AATAATGTTGCAATCGGCGTGACC TTCTC TGGTGAAGCCAGCCAAATGTTAGAAAAAAATGAAAATCTACGTTAT 
GTGGTACC GACA GAGG CCAGC AATCTTTGGTTTGACAATATGGTCATTCCCAAAACAGTTAAAAACCAA^
GCarATGCCTTTATCAACTTTATGTTGAAACCTGAAAATGCTCrCCAAAATGCGGAGTATGTCGGC^^ 
CAAACCTACCAGCGAAGGAATTGCTCCCAGAGGAAACAAAGGAAGATAAGGCCTTCTATCCCGATGTTGAAACCA 
TGAAACACCTAGAAGTTTATGAGAAATTTGACCATAAATGGACAGGGAAATATAGCGACCTCTTCCTACAGTTTA 
AAATGTATCGGAAGTAG 


MKKIYSFLAGIAAnLVLWGlATHLDSKINSRDSOKLVIY>rWGDYIDPELLTQFTEETGIQVQYETFDShmA^ 
TTYDIAIPSEYMINKMKDEDLLVPLDYSKIEGIENIGPEFLNQSFDPGNKFSIPYFWGTLGIVYKETMVDEAPEHWDDLW 
KPEYKNSIMLFDGAREVLGLGLNSLGYSLNSKDLQQLEETVDKLYKLTPNIKAIVADEMKGYMIQNNVAIGVTFSGEAS 
QMLEKMENLRYVVPTEASNLWFDNMVIPKTVKNQNSAYAFINFMLKPENALQNAEYVGYSTPNLPAKELLPEETKED 
KAFYPDVETMKHLEVYEKFDHKWTGKYSDLFLQFKMYRKZ
ID61 1851bp

ATGAATAAAAAACTAACAGATTATGTGATTCATCTGGTGGAAATTTrAAATAAACA^ 
GGAATATTTGATATTTTCAGTATGGTGGTrrCCATCATTGTATCTTATATrTTATTT^

ACCTGTTGACTACATTATCTATACGAGTTTGGCCrrCCTGTTCrATCAATTGATGATTGai i i 1 i GGGGGTTGAACG 

CGAGCATTAGTCGTTACAGCAAGATTACGGATTTCATGAAAATCTTTTTTGGT GTGA CT^ 

ATATAGTATCTGTTATGCCTTCTTGCCACTCITCTCCATCCGTTTCATCATTCTC^ 

GATTTTATTGCCACGGATTACTTGGCAGTTAATCTACTCCAGACGCAAAAAAGGTAGTGGTGATGGAGAACACCG 

TCGGACCTTCTTGATTGGTGCCGGTGATGGTGGGGCTCrrTTTATGGATAGTTACCAACATCC AAC^^

GAACTGGTCGGTATTTTGGATAAGGATTCTAAGAAAAAGGGTCAAAAACTTGGTGGTATTCCTUi 1 1 iGGGCTCTT 
ATGACAATCTGCCTGAATTAGCCAAACGCCATCAAATCGAGCGTGTCATCGTTGCGATTCCGTCGCTGGATCCGTC 
AGAATATGAGCGTATCTTGCAGATGTGTAATAAGCTGGGTGTCAAATGTTACAAG ATGCC TAAGGTrGAAACTGT 
TGTTCAGGGCCTTCACCAAGCAGGTACTGGCTTCCAAAAAATTGATATTACGGACCTTTTGGGTCCT^ 

CGTCTTGACGAATCGCGTCTGGGTGCAGAACTGACAGGTAAGACCATCTTAGTCACAGGAGCTGGAGGTTCAATC
GGTTCTGAAATCTGTCGTCAAGTTAGTCGCTTCAATCCTGAACGCATTGTCTTGCTCGGTCATGGGGA^ 
TCTACCTTGTTrATCATGAATrGATTCGTAAGTTCCAAGGGATTGATTATGTACCTGTGATTGCGGACATTCAAGA 
CTATGATCGTTTGTTGCAAGTCTTTGAGCAGTACAAACCTGCTATTGTTTATCATGCGGCAGCCCACAAGCAT^ 
CCTATGATGGAGCGCAATCCAAAAGAAGCCTTCAAAAACAATATCCGTGGAACTTACAATGTTGCTAAGGCTGTT 

GATGAAGCTAAAGTGTCTAAGATGGTTATGATTTCGACAGATAAGGCAGTCAATCCACCAAATGrTATGGGAGCA
ACCAAGCGCGTGGCGGAGTTGATTGTCACTGGCrrTAACCAACGTAGCCAATCAACCTACTGTGCAGTTCGTT^ 
GGAATGTTCTTGGTAGCCGTGGTAGTGTCATTCCAGTCnTTGAACGTCAGATTGCTGAAGGTGGGCCTGTAACGGT 
GACAGACTTCCGTATGACCCGTTACTTTATGACCATTCCAGAAGCTAGCCGTCTGGTTATCCATGCTGGTGOT 
GCCAAAGATGGGGAAGTCrrTATCCTTGATATGGGCAAACCAGTCAAGATTTATGACTTGGCCAAGAAGATGGTG 

CTTCTAAGTGGCCACACTGAAAGTGAAATTCCAATCGTTGAAGTTGGAA TCCG CCCAGGTGAAAAACTCTACGAA
GAACTCTTGGTATCAACCGAACTCGTTGATAATCAAGTTATGGATAAGATTTTCGTTGGTAAGGTTAATGTCATGC 
CTTTAGAATCCATCAATCAAAAGATTGGAGAGTTCCGCACTCTCAGTGGAGATGAGTTGAAGCAAGCTATTATCG 
CCTTTGCTAATCAAACAACCCACATTGAATAA 

MNKKLTDYVIDLVEILNKQQKQVFWGIFDlFSMVVSnVSYILFYGLINPAPVDYIIYTSLAFLFYQLMIGFWGLNASISRy
SKrrDFMKIFFGVTASSVl^YSICYAFU>LFSIRFIILPILLSTFLIIXPRn^QLIYSRRKKGSGIXi 

MDSYQHPTSELELVGILDKDSKKKGQiCLGGIPVLGSYDNLPELAKRHQlERVIVAlPSLDPSEyERILQMCNKLGVKCYK 
MPKVETVVQGLHQAGTGFQKIDITDLLGRQEIRLDESRLGAELTGKTILVTGAGGSIGSEICRQVSRFNPERIVLLGHGEN 
SIYLVYHELIRKFQGIDYVPVIADIQDYDRLLQVFEQYKPAIVYHAAAHKHVPMMERNPKEAFKNNIRGTYNVAKAVD 
EAKVSKMVMISTDKAVNPPNVMGATKRVAELIVTGFNQRSQSTYCAVRFGNVLGSRGSVIPVFERQIAEGGPVTVTDFR
MTRYFMTIPEASRLVIHAGAYAKDGEVFILDMGKPVKIYDLAKKMVLLSGHTESEIPIVEVGIRPGEKLYEELLVSTELV 
DNQVMDKIFVGKVNVMPLESINQKIGEFRTLSGDELKQAilAFANQTTHIEZ 

ID101 1338bp


ATGATTGAACTTTATGATAGTTACAGTCAAGAAAGTCGAGATrrACATGAAAGTCTA^ 
AACrrGGAGTGGTCATCGATGCAGATGGTTTTCTGCCTGATGGTCTGCTTTCn'CC TTT^ 
GAGGATGGAAAACCrCTCTATTTTAATCAAGTTCCCGTTTCAGATTTITGGGAAATm 

CTTGTATTGAAGATGTGACGCAGGAGAGGGCTGTCATTCATTATGCTGATGGAATGCAGGCTCGCT TGGTT AAACA 
GGTAGACTGGAAAGACCTAGAAGGTCCAGTACGTCAGGTTGACCACTACAATCGCTTCGGAGCTTGTTp-GCT^
AACGACrrATAGCGCAGATAGCGAGCCGATTATGACAGTTrACCAAGATGTCA ATGGT CAACAAGTTTTACTGGA 
AAACCATGTGACGGGTGATATCTTATTGACrrrGCCAGGTCAGTCCATGCGTTACTTrGCAAATAAAGTO 
ATCAC Cl ILl 1 \ i I GCAAGATTrGGAAATAGATACCAGTCAGCTTATCTTTAATACTCTAGCGACTCCI 1 iCi IGGT 
TTCCTTCCATCATCCAGATAAATCTGGCTCGGATGTCTTGGTATGGCAGGAACCTCTCrATGATGCCATTCCAGGT 
AATATGCAGTTGATTTTGGAAAGTGATAATGTGCGTACTAAGAAGATCATCATTCCAAATAAGGCGACTTATGAG
CGCGCTTTAGAGTTAACTGACGAGAAATACCATGATCAGTTTGTGCACrrGGGTTATCATTACCAGTTCAAAC^^ 
ATAATrrCCTAAGACGAGATGCCTTAATCTTGACCAATTCAGATCAGATTGAGCAAGTAGAAGCAATCGCAGGAG 
CCTTGCCrGATGTCACTTTCCGTATTGCAGCGGTGACAGAGATGTCTTCTAAGCTCTTAGACATGCT^ 
AATGTGGCCCTTTACCAGAACGCTAGTCCACAGAAGATTCAGGAGCTGTATCAACTGTCGGATATTTACTTGGATA 
TAAACCACAGTAATGAGTTGCTACAGGCAGTGCGTCAGGCCTTrGAGCACAATCTCnTGATTCTO

GACGGTGCACAATAGACrrTATATCGCTCCAGACCATCTATTTGAAAGTAGTGAAGTrGCTGCTTTGG^^ 

ATTAAATTGGCCCTrrCAGATGTTGATCAAATGCGTCAGGCACTTGGCAi^ 

ACTTGGTGAGATATCAGGAAACCATGCAAACTGTTTTAGGAGGCTAA 

MIELYDSYSQESRDLHESLVATGLSQLGVVIDADGFLPDGLLSPFTYYLGYEDGKPLYFNQVPVSDFWEILGDNQSACIE
DVTQERAVIHYADGMQARLVKQVDWKDLEGRVRQVDHYNRFGACFATTTYSADSEPIMTVYQDVNGQQVLLENHV 
TGDILLTLPGQSMRYFANKVEFITFFLQDLEIDTSQLIFNTLATPFLVSFHHPDKSGSDVLVWQEPLYDAIPGNMQULES 
DNVRTKKniPNKATYERALELTDEKYHDQFVHLGYHYQFKRDNFLRRDALILTNSDQlEQVEAlAGALPDVTFRIAAVT 
EMSSKLLDMLCYPNVALYQNASPQKIQELYQLSDIYLDINHSNELLQAVRQAFEHNLLILGFNQTVHNRLYIAPDHLFE 

SSEVAALVETIKLALSDVDQMRQALGKQGQHANYVDLVRYQETMQTVLGGZ
ID102 ISUbp

ATGACAATTTACAATATAAATTTAGGAATTGG^ 
GTGTITTTCGGAAA TTAA ATCrGTCCTCTAAGTTTATCTTTACAGATATGATrrrAGCCGAT

ACAGCCAATATTGGTTTTGATGATAATCAGGTTATCTGGCTTTATAATCATTTCACAGATATCAAAATTGCACCT 

CTAGCGTGACAGTGGATGATGTCTTGGCTTACT7TGGTGGTGAAGAAAGTCACAGAGAAAAAA 

TACGTGTATTCTTTmGACCAAGATAAGTTTGTAAC 

TGCCGAGTATGTTTTTAAGGGAAACCTGATTCGGAAGGATTACi I U Li lATACGCGTTATTGTAGCGAGTATnT 
GCTCCCAAGGACAATGTTGCAGTCTTATACCAACGAACTTmATAATGAAGACGGGACTCCAGTCT^
TGATGAATCAAGGGAAGGAAGAAGTTTATCATTTCAAGGATAAGATTTTCT^ 
CCTrrATGAAATCmGAATTTGAATAAGTCrGATTTGGTCATTCTCGATAGGGAGACAGGT^ 
GTTTGAGGAA GCAC AGACAGCACATCTAGCGGTAGTTG7TCATGCGGAGCATTATAGTGAAAATGCTACAAATGA 
GGACTATATCCTTTGGAATAACTATTATGACTATCAGTTTACCAATGCAGATAAGGTTGACnTC^ 
ACTGATAGACAAAATGAAGTTCTACAAGAGCAATTrGCCAAATATACTCAGCATCAGCCAAAGATTGTTACCATT
CCTGTAGGCAGTATTGATTCCTTGACAGATTCAAGTCAAGGGCGCAAACCATTTTCATTGATTACGGC^ 
TTGCCAAAGAAAAGCACATTGATTGCCTTGTGAAAGCTGTGATTGAAGCTCATAAGGAGTTACCGGAACTAACCT 
TTGATATCTATGGTAGTGGTGG AGAAG ATTCTCTGCTTAGAGAAATTArrGCAAATCATCAGGCAGAGGACTATAT 
CCAACTCAAGGGGCATGCCGAACTTrCGCAGATTTATAGQCAGTATGAGGTCTACTTAACGGCrrCTACCACCGA 
AGGATTTGGTCTGACCTTGATGGAAGCTATTGGTTCAGGTCTACCTCTAAT^

AGACCTTTATAGAGGATGGGCAAAATGGTTATTTGATTCCAAGTTCATCTGACCATGTAGAAGACCAAATCAAGC 

AAGCrTATGCCGCTAAGATTTGTCAA TTGTA TCAAGAAAATCGTTTGGAAGCTATGCGTGCCTATTCTTAC^ 

TGCAGAAGGCTTCTrGACCAAAGAAATTrTAGAAAAGTGGAAGAAAACAGTAGAGGAGGTGCTCCATGATTGA 

MTIYNINLGIGWASSGVEYAQAYRAGVFI^Nl^SKFIFrDMILADNIOHLTANIGFDDNQVIWLYNHFTDIKIAmVT
VDDVLAYFGGEESHREKNGKVLRVFFFDQDKFVTCYLVDENKDLVQHAEYVFKGNLIRKDYFSYTRYCSEYFAPKDN 
VAVLYQRTFYNEIXlTPVYDILMNQGKEEVYHFKDKrFYGKOAFVRAFMKSLNLNKSDLVILDRETGIGQVVFEEAQTA 
HLAVVVHAEHYSENATNEDYILWNNYYDYQFTNADECVDFFIVSTDRQNEVLQEQFAKYTQHQPKIVTIPVGSIDStTDS 
SQGRKPFSLrrASRLAKEKHIDWLVKAVIEAHKB-PELTFDIYGSGGEDSLLREIIANHQAEDYIQLKGHAELSQlYSQYE 

VYLTASTSEGFGLTLMEAIGSGLPLIGFDVPYGNQTFIEDGQNGYLIPSSSDHVEDQIKQAYAAKICQLYQENRLEAMRA
YSYQUEGFLTKEILEKWKKTVEEVLHDZ






ID103 2292bp



atgtcctct ctttc ggatcaagaattagtagctaaaacagtagagritcgtcagcgtctttccgagggagaaagtc 

tagacgatattttggttgaagcttttgctgtgotgcgtgaagcagataagcggatrrtagggatgtrr

tgttcaagtcatgggagctatrgtcatgcactatggaaatgttgctgagatgaatacgggggaaggtaagacctt 
gacagctaccatgcctgtctatttgaacgctttttcaggagaaggagtgatggrrgtgactcctaatgagtat^ 
tcaaagcgtgatgccgaggaaatgggtcaagtttatcgrrrtctaggattgaccattggtgtaccatttacggaag 
atccaa agaag gagatgaaagctgaagaaaagaagcttatctatgcttcggatatcatctacacaaccaatagta 

atttaggttttgattatctaaatgataacctagcctcgaatgaagaaggtaagtttttacgaccgttt^

gatrattgatgaaattgatgatatcrrgcttgatagtgcacaaactcctctgattattgcgggttctcctcgtg 
agtctaattactatgcgatcattgatacacttgtaacaacctrggtcgaaggagaggattatatctt^ 
gaaagaggaggtttggcrcactactaagggggccaagtctgcrgagaattrccragggat^ 
ggaagagcatgcgtcttttgctcgtcatrrggtttatgcgattcgagcrcataagctctt^ 

tatatcattcgtggaaatgagatggtactggttgataagggaacagggcgtctaatggaaatgactaaacttcaa
ggaggtctccat caggct attgaagccaaggaacatgtcaaartatctcctgagacgcgggctatggcctcgatc 
acctatcagagtcttmaagatgtttaataagatatctgctatgacagggacaggtaaggtcgcggaaaaagag 
trrattgaaact taca atatgtctgtagtacgcatrccaaccaatcgtccgagacaacggattgactatccagata 
atctata tatcac titacctgaaaaagtgtatgcatccttggagtacatcaagcaataccatgctaagggaaatcc 

tttactcgttrrrgtaggcrcagtrgaaatgtctcaactctattcgtctctcttg

atgtcctaaatgctaataatgcggcgcgtgaggctcagattatcrccgagtcaggtcagatgggggctgtgacag 
tggctacctctatggcaggacgtggtacggatatcaagcttggtaaaggagtcgcagagcttgggggcttgattg 
ttatrgggactgag cggatg gaaagtcagcggatcgacctacaaattcgtggccgttctggtcgtcagggagatc 
ctggtatgagtaaatttrttgtatccttagaggatgatgttatcaagaaatrrggtccatot 

gtacaaagactatcaggttcaagatatgactcaaccggaagtattgaaaggtcgtaaataccggaaactagtcga
aaaggctcagcatgccagtgatagtgctggacgttcagcacgtcgtcagactctggagtatgctgaaagtatgaa 
tatacaacgggatatagtctataaagagagaaatcgtctaatagatggttctcgtgacttagaggatgttgttgtg 
gatatcattgagagatatacagaagaggtagcggctgatcactatgctagtcgtgaattattgrrrcactttattg 
tgaccaatattagrmcatgttaaagaggtrccagattatatagatgtaactgacaaaactgcagttcgtagctt 

tatgaagcaggtgattgataaagaactttctgaaaagaaagaattacttaatcaacatgacttatatgaacag^
TTTACGACTTTCACTGCrTAA.^GCCATTGATGACAACTGGGTAGAGCAGGTAGACTATCTACAAC^^

GCTATCGGTGGTCAATCrGCIAGTCAGAAAAATCCAATCGTAGAGTACTATCAAGAAGCCTACGCGGGCrrTGA^ 

GCTATGAAAGAACAGATTCATGCCGATATGGTGCGTAATCTCCTGATGGGGCTGGTTGAGGTCACTCCAAAAGGT 

GAAATCGTGACTCATTTTCCATAA 

MSSLSDQELVAKTVEFRQRI^EGESLDDILVEAFAVVREADKRILGMFPYDVQVMGAIVMHYGNVAEMNTGEGKTLT
ATMPVYLNAFSGEGVMVVTPNEYLSKRDAEEMGQVYRFLGLTIGVPFTEDPKKEMKAEEKKLIYASDIIYTTNSNLGF 
DYLNDKLASNEEGKFLRPFNYVIIDEIDDILLDSAQTPLIlAGSPRVQSNYYAIIDTLVTrLVEGEDYIFKEE^EVWL^^^ 
GAKSAENFLGIDNLYKEEHASFARHLVYAIRAHKLFTBCDKDYIIRGNEMVLVDKGTGRX-MEMTKLQGGLHQAIEAKEH 

vKl^PETRAMASITYQSU^MFNKISGMTGTGKVAEKEHETYNMSWRIPTNRPRQRIDYPDNLYaU'EKVYASl^

OYHAKGNPLLVFVGSVEMSOLYSSU-FREGIAHNVLNANNAAREAQnSESGQMGAVTVATSMAGRGTDIKLGKGVAE 
LGGLIVIGTERMESQRIDLQIRGI^GROGDPGMSKFFVSLEDDVIKKFGPSWVHKKYKDYQVQDN^QPEVLKGRKYRK 
LVEKAQHASDSAGRSARRQTLEYAESMNlQRDIVYKERNRLIDGSRDLEDVVVDnERYTEEVAADHYASRELLFHFI^ 
NlSFHVKEVPDYIDVTDKTAVRSFMKQVIDKEI^EKKELLNQHDLYEQFLRLSLU:AlDD^nVVEQTOYLQQLSM 

QSASQKNPIVEYYQEAYAGFEAMKEQIHADMVRNLLMGLVEVTPKGEIVTHFPZ

ID104 879bp 

ATGAAACAAGAATGGTTTGAAAGTAATGATnTGTAAAAACAACAAGCAAGAACAAGCCTGAAGAGCAAGCTCA 
AGAGGTTGCAGACAAGGCTGAAGAAAGGATACCCGATCTCGATACACCAATTGAAAAAAATACTCAGTTAGAGG
AGGAAGTCTCTCAAGCTGAAGTCGAATTGGAAAGCCAGCAAGAAGAGAAAATTGAAGCTCCrGAAGACAGT^^ 
GCGAGAACAGAAATAGAAGAAAAGAAGGCATCTAATTCTACTGAAGAAGAGCCAGACCTTTCTAAAGAAACAGA 
AAAAGTCACTATAGCTGAAGAGAGCCAAGAAGCTCTTCCTCAGCAAAAAGCAACCACGAAAGAGCCA^i iCiiAT 
CAGTAAATCTTTAGAAAGTCCTTATATCCCCGACCAAGCTCCAAAATCTAGGGATAAATGGAAAGAGCAAGTGCT 
">5 TGATTTTTGGTCTTGGCTAGTGGAAGCGATCAAATCTCCTACAAGTAAGTTGGAAACAAGTATCACACACAG^ 
ACAGCCTTTCTCTTGCTCATTCTGTrrrCTGCATCTTCL'rinUlCiU-lAGTATCTA TCACATC A^ 
GGACATATAGCAAGCATTAACAGTCGCTTCCCTGAGCAGCTAGCTCCTTTAACTCTTT^ 
AGTAGCGACAACACT CJMCi iLi 1 1 i CATTCCTCTrGGGTAGTTTCGTTGTGAGACGATTTATCCACCAGGAAAAG 
GACTGGACGCTAGACAAGGTrCTCCAACAATATAGTCAACTCTTGGCAATTCCAATCTCCTCACTGCTATTGCT^ 

rnci u g ci i ici i i gatagcctacgatttacagccctcttgtgtgtga

MKQ^WFESNDFVKTTSKNKPEEQAQEVADKAEERIPDLDTPIEKNTQLEEEVSQAEVELESQQEEKIEAPEDSEARTEIE 
EKK^SNSTEEEPDLSKETEKVTIAEESOEALPQQKATTKEPLUSKSLESPYIPDQAPKSRDKWKEQVLDFWSWLVEAIKS 
PTSKLETSITHSYTAFLLLE-FSASSFFFSIYHIKHAYYGHIASINSRFPEQLAPLTLFSIISILVATTLFFFSFLLGSFVVRRFIH 
QEKDWTLDKVLQQYSQLLAIPISSLLLLVSLLSLiAYDLQPSCVZ

ID106 327bp 

ATGTACTTTCCAACATCCTCTGCCTTGATTGAATTTCTCATCTTGGCTGTACTGGAGCAGGGTGATTOT 
TGAGATTAGCCAAACCATTAAGCTGATCGCTAATATCAAAGAATCCACACn'CTATCCCATTCTCAAAAAATTGGA
AGGCAATAGCTTTCTGACAACCTATTCTAGAGAGTTCCAAGGTCGCATGCGCAAATACTACTCCTrGACAAACGG 
TGGTATAGAGCAGCTCTTGACCCTAAAAGATGAATGGGCACTCTATACAGACACCATCAATGGCATCATAGAAGG 

GAGTATCCGCCATGACAAGAACTGA 

MYFPTSSAUEFLILAVLEQGDSYGYEISQTIKLIANIKESTLYPILKKLEGNSFLTTYSREFQGRMRKYySLTNGGIEQLLT
LKDEWALYTDTINGIIEGSIRHDKNZ

ID108 954bp

ATGGATTTTGAAAAAATTGAACAAGCrTATATCTATTTACTAGAGAATGTCCAAGTCATCCA^

CCAACTTrTATGACGCCTTGGTGGAGCAAAATAGCATCTATCTGGATGGTGAAACTGAGCTAAACCAGGTCAAAG 
ACAACAATCAGGCCCTTAAGCGTTTAGCACTACGCAAAGAAGAATGGCTCAAGACCT ACCAGmCr CTTGATGA 
AGGCTGGGCAAACAGAACCCITGCAGGCCAATCACCAGTTTACACCGGATGCTATTGCTTTGci 1 1 ICGTGTTTAT 
TGTGGAAGAGTTGTTTAAAGAGGAGGAAATTACTATCCTCGAAATGGGTTCTGGGATGGGAATTCTAGGCGCTAT 

TTTCTTGACCTCGCTTACTAAAAAGGTGGATTACTTGGGAATGGAAGTGGATGATTTGCTGATTGATCT^

AGCATGGCAGATGTAATTGGTTTGCAGGCTGGCTTTGTCCAAGGAGATGCCGTTCGCCCACAAATGCTCAAAGAA 
AGCGATGTGGTCATCAGTGACrrGCCTGTCGGCTATTATCCTGATGATGCCGTTGCGTCGCGCCATCAAGTTGCTT 
CTAGCCAAGAACATACTTACGCCCATCACTTGCTCATGGAACAAGGGCTTAAGTACCTCAAGTCAGACGGATACG 
CT ATTTTTCTAGCTCCGAGTGATTTGTTGACCAGTCCTCAAAGTGATTTGTTAAAAGAATGGCTC 

GAGTCTGGTTGCTATGATrAGTCTGCCTGAAAATCTCTTTGCTAATGCCAAACAATCTAAGACTAT ^^
AGAAGAAAAATGAAATAGCAGTAGAGCCTTTTGTTTATCCACTTGCTAGCTTGCAAGATGCAAGTGT^ 
ATTTAAAGAAAATTTTCAAAAATGGACTCAAGGTACTGAAATATAA 

MDFEKIEQAYIYLL£>ArQVIQSDLATNFYDALVEQNSIYLDGETELNQV}CDNNQALKRLALRKEEWLKTYQFLLMKA 
GQTEPLQANHQFTPDAlALLLVFIVEELFKEEErriLEMGSGMGILGAIFLTSLTKKVDYLGMEVDDLUDLAASMADVl
GLQAGFVQGDAVIU>OMLKESDVVISDLPVGYYPDDAVASRHQVASSQEHTYAHHLLMEQGLKYUCSDGYAIFLAPSD
LLTSPQSDLLKEWLKEEASLVAMISLPENLFANAKQSKTIFILQKKNEIAVEPFVYPLASLQDASVLMKFKENFQKWTQG 
TEIZ 

ID110 1902bp

ATGATTATTTTACAAGCTAATAAAATTCAACGTTCTTTTGCAGGAGAGGrrCTT^ 

TTGATGAACGAGATCGGATTGCTCrrGin'GGGAAAAATGGTGCAGGTAAGTCTACTCrinUGAAGATTTTAGTTGG 
AGAAGAGGAGCCAACTAGCGGAGAAATCAATAAGAAAAAAGATATTTCTCTGTCTTACCTAGCCCAAGATAGCCG 

TTTTGAGTCTGAAAATACCATCTACGATGAAATGCTTCATGTCnTTAATGATTTGCGTC^

CGTCAGATGGAGCTGGAGATGGGTGAAAAGTCTGGTGAGGATn'GGATAAACTGATGTCAGATTATGACCGCTTA 
TCTGAGAATTTTCGCCAAGCAGGTGGCrrrACCTATGAAGCTGATATTCGAGCGATTTTG;^ 
ACGAGTCTATGTGGCAGATGAAAATTGCTGAGCmCTGGTGGTCAAAATACTCGTTTGGCACTTGC^ 
CCTTGAAAAGCCCAATCTCrrGGTCTTGGACGAGCCAACTAACCACrTGGATATTGAAACCATCG 

GAATTACrrGGTAAACTATAGCGGTGCCCTCATrATCGTCAGCCACGACCGTTAll'l'Cl IGGACAAGGTTGCGA

ATTACGCTAGATTTGACCAAGCATTCCTTGGATCGCTATGTGGGGAATTACTCTCGTTTTGTCGAATTGAAGGAG 
AAAAGCTAGTTACTGAGGCAAAAAACTATGAAAAGCAACAGAAGGAAATCGCTGCTCTGGAAGACTTTGTCAATC 
GCAATCTAGTTCGTGCTTCAACGACTAAACGTGCTCAATCTCGCCCTAAACAACTAGAAAAAATGGAGCGm 
ACAAGCCTGAAGCTGGCAAGAAAGCAGCCAACATGACCTTCCAGTCTGAAAAAACGTCGGGCAATGTTGTTT^ 

CTGrrGAAAATGCAGCTGrrGGCTATGACGGGGAAGTCrrGTCACAACCTATCAACCTAGATCTTCGTAAGATGAA
TGCTGTCGClATCGTTGGTCCAAATGGTATCGGCAAGTCAACCrrrATCAAGTCrATTGTGGACCAGATTCCTm 
ATCAAGGGAGAAAAGCGCmGGCCCTAATGTTGAGGTTGGTTACTATGACCAAACCCAAAGCAAGCTGACACCA 
AGTAATACGGTGCTGGATGAACTCTGGAATGATTTCAAACTGACACCAGAAGTTGAAATCCGCAACCGTCTTGGA 
GCCTTCCri'lUCTCAGGAGATGATGTTAAAAAATCAGTCGGCATGCTATCTGGTGGCGAAAAAGCTCGTTTGC^ 

TAGCTAAATTGTCTATGGAAAACAATAACll-ri-lGATTCTGGATGAGCCGACCAACCACTTGGATATTGATAG
GGAAGTGCrAGAAAATGCCTTGArrGACTrTGATGGAACCTTGCTGTTTGTCAGTCATGATCGTTA 
CGTGTGGCAACTCATGTmGGAATTGTCTGAGAATGGTTCAACTCTCTACCTTGGAGATTACGACTACTATG^^ 
AGAAGAAAGCAACAGCAGAAATGAGTCAGACTGAGGAAGCTTCAACTAGCAATCAAGCAAAGGAAGCAAGTCCA 
GTCAATGACTATCAGGCCCAGAAAGAAAGTCAAAAAGAAGTTCGCAAACTCATGCGACAAATCGAAAGTCTACA 

AGCTGAAATTGAAGAGCTAGAAAGTCAAAGCCAAGCCATTTCTGAACAAATGTTGGAAACAAACGATGCC

AACTCATGGAATTACAGGCTGAGCTGGACAAAATCAGCCATCGTCAGGAAGAAGCTATGCTTGAGTGGGAAGAAT 
TATCAGAGCAGGTGTAA 

MIILQANKIERSFAGEVLFDNINLQVDERDRULVGKNGAGKSTLLKILVGEEEPTSGEINKKKDISLSYLAQDSRFESENT 
lYDEMLHVFNDLRRTERQLRQMELEMGEKSGEDLDKLMSDYDRLSENFRQAGGFTYEADIRAILNGFKFDESMWQMK
lAELSGGQNTRLALAKMLLEKPNLLVLDEPTNHLDrETIAWLENYLVNYSGAUrVSHDRYFLDKVATITLDLTKHSLDR 
YVGNYSRFVELKEQKLVTEAKNYBKQQKEIAALEDFVNRNLVRASTTKRAQSRRKQLEKMERI-DKPEAGKKAANMTF 
OSEKTSGNVVLTVENAAVGYDGEVLSQPINLDLRKMNAVAIVGPNGIGKSTFIKSIVDQIPFIKGEKRFGANVEVGYYDQ 
TQSKLTPSrrrVLDELWNDFKLTPEVEIRNRLGAFLFSGDDVKKSVGMLSGGEKARLLLAKLSMENNKFLILDEPTNHU 
DIDSKEVLENALIDFDGTLLFVSHDRYHNRVATHVLELSENGSTLYLGDYDYYVEKKATAEMSQTEEASTSNQAKEAS
PVNDYQAQKESQKEVRKLMRQIESLEAEIEELESQSQAISEQMLETNDADKLMELQAELDKISHRQHEAMLEWEELSEQ 
VZ

ID111 in9bp


ATGAATCGCTATGCAGTGCAGTTGArrAGCCCTGGGGCTATCAATAAAATGGGAAATATGCTCTATGATTATGGA 
AATAGTGTCTGGTTGGCTTCTATGGGGACrATAGGACAGACAGTTrrAGGAATGTATCAGATTTCrGAGCTCGT^ 
CATCTATTCTCGTCAATCCCmGGCGGAGTTATTTCAGACCGTTTTrCTCGTCGTAAGATrrT 
CrrGTTTGTGGGATTCrrTGTCrGGCTA)' 1 I C iTl CATAAGGAATGATAGCTGGATGATTGGCGCnTGATTGTTGC 
TAACATTGTGCAGGCTATTGCTTTTGCCTTTTCTCGCACAGCCAATAAAGCTATCATAACTGAAGTG^
GATGAGATTGTGATCTATAATTCrCGCTTAGAGCTGGTTTTGCAGGTTGTAGGTGTTAGCTCTCCTGT^ 
CCTTGrmACAGTTTGCAAGTCrCCATATGACGCTACTGCrAGACTCGCTGACTTT^ 

TGGCTTTCCTTCCAAAAGAGGAAGCAAAAGTTCAAGAGAAAAAGG Cl I I 1 ACTGGGAGAGATATTTTTGTAGATA 
TCAAGGATGGGrrACACTATATCTGGCAT CAGC AAGAAA'irriCrJ^CCriTrGCTGGTAGCTTCCAGCGTTAATTT 
CTTTTTTGCAGCTTTTGAATTTCTACTTCCtnT^

TAACTATGGGGGCTATTGGTTCCATCATTGGGGCTCTTCTAGCTAGTAAAATTAAAGCTAATATr^ 
GATTTrACTGGCTTTGACAGGTGTCGGAGim 

ATTTAGTTTGTGAATTGTTrATGAC GATnT TAATATTCA C ' l iTlTl ACTCAAGTACAAACCAAGGTTGAGAGCGAA 
TrrmGGAAGAGTACTGAGTACAATTTTrACCTTAGCTATTCTATTTATGCCT^ 
CTTGCCAAGTGTCCATCTTTATTCTTTCTTGATTATTGGACTTGGAGTTGTAGCm
ATGrrCGAACTCATTTTCAAAAArrGATATAA 

MNRYAVQLISRGAINKMGNMLYDYGNSVWLASMGTIGQTVLGMYQISELVTSILVNPFGGVISDRFSRRKILMTADLV 
CGILCLAlSFIRNDSWMlGALIVANIVOAlAFAFSRTANKAirrEVVEKDEIVIYNSRLELVLQVVGVSSPVLSFLVLQFASL 
HMTLLLDSLTFFIAFVLVAFLPKEEAKVQEKKAFTGRDIFVDIKDGLHYIWHQQEIFFLLLVASSVNFFFAAFEFLLPFSN
QLYGSEGAYASILTMGAIGSIIGALLASiaKANIYNlXILLALTGVGVFMMGLPLPTFl^FSGNLVCELFMTIFNIHFn^
QTKVESEFLGRVl^IFTLAILFMPIAKGFMTVLPSVHLYSFUIGLGVVALYFLALGYVRTHFEKUZ 

ID113 2466bp

atgcaaaatcaattaaatgaattaaaacgaaaaatgctggaatttitccagcaaaaacaaaaaaatai^^ 

gctagacctggcaagaaaggttcaagtaccaaaaaatctaaaaccttagataagtcagccattttcccagctatt 

ttactgagtataaaagccttatttaacttactctttgtactcggtmcta 

cnrrgggatacggagtggccttatttgacaaggtrcgggtgcctcagacagaagaattggtgaatcaggtcaagg 
acatctcrrctatttcagagattacctattcggacgggacggtgattgcitccatagagagtgattr a

ttctatctcatctgagcaaatttcggaaaatctgaagaaggctatcattgcgacagaagatgaacactt^ 
acataagggtgtagtacccaaggcggtgattcgtgcgaccntggggaaatttgtaggtttgggttcctctag 
ggttcaaccttgacccagcaactaattaaacagcaggtggttggggatgcgccgaccrtggctcgtaaggcggc^ 
gagattgtggatgctcttgccttggaacgcgccatgaataaagatgagattttaacgacctatctcaatct^ 
cctttggccgaaataataagggacagaatattgcaggggctcggcaagcagctgagggaattttcggtgtagat^
ccagtcagtrgactgrrcctcaagcagcatttttagcaggacttccacagagtcccattact^ 
aaatactggggagttgaagagtgatgaagacctagaaattggcnaagacgggctaaggcaoi'lci'i'iacagtat 
gtatcgtacaggtgcattaagcaaagacgagtattctcagtacaaggattatgaccttaaacaggacttrttacc 

ATCGGGCACGGTTACAGGAATTTCACGAGACTATTTATACmACAACTTTGGCA 
GACTATCTAGCTCAGAGAGACAATGTCTCCGCTAAGGAGTTGAAAAATGAGGCAACTCAGAAGTm

TTGGCAGCCAAGGAAATTGAAAATGGTGGTTATAAGATTACTACTACCATAGATCAGAAAATTCATTCTGCCATG 
CAAAGTGCGGTTGCIGATTATGGCTATCl'rriAGACGATGGAACAGGTCGTGTAGAAGTAGGGAATGTCTTGATG 
GATAACCAAACAGGTGCTATTCTAGGCTTTGTAGGTGGTCGTAATTATCAAGAAAATCAAAATAATCATGCCm 
ATACCAAACGTTCGCCAGCTTCTACTACCAAGCCCnTGCTGGCCTACGGTATTGCTATTGACCAGGGCTTG 
AAGTGAAACGATTCTATCTAACTATCCAACAAACTrrGCTAATGGCAATCCCATTATGTATGCrAATAGCAAGGG
AACAGGAATGATGACCTTGGGAGAAGCTCTGAACTATTCATGGAATATCCCTGCTTACTGGACCTATCGTATGCTC 
CGTGAAAAGGGTGTTGATGTCAAGGGTTATATGGAAAAGATGGGTTACGAGATTCCTGAGTACGGTATTGAGAGC 
TTGCCAATGGGTGGTGGTATTGAAGTCACAGrrGCCCAGCATACCAATGGCTATCAGACCTTAGCTAATAATGGA 
GTTTATCATCAGAAGCATGTGATTTCAAAGATTGAAGCAGCAGATGGTAGAGTGGTGTATGAGTATCAGGATAAA 
CCGGTTCAAGTCTATTCAAAAGCTACTGCGACGATTATGCAGGGATTGCTACGAGAAGTTCrATCCTCT^
CAACAACCITCAAGTCTAACCrGACrTCTrrAAATCCTACTCTrGGCT 

aaccaaccaagacgaaaatatgtggctcatgctttcgacacctagattaaccctaggtggctggattgggcatga 
tgataatcattcattgtcacgtagagcaggttattctaataacrctaattacatggctcatctggtaaatgcgatt 
cagcaagcttccccaagcatttgggggaacgagcgcmgctttagatcctagtgtagtgaaatcgga^ 
aaatcaacaggtcaaaaaccagagaaggtttctgttgaaggaaaagaagtagaggtcacaggttcgactgttacc
agctattgggctaataagtcaggagcgccagcgacaagttatcgctttgctattggcggaagtgatgra 
cagaatgcttggtctagtattgtggggagtctaccaactccatccagctccagcagttcaagtagtagttctagcg 
atagcagtaactcaagtactacacgaccntcttcttcaagggcgagacgataa 

mqnqlnelkrkmleffqqkqknkksarpgkkgsstkksktldksaifpaillsikalfnllfvlgflggmlgagulgy
gvalfdkvrvpqteelvnqvkdissiseitysdgtviasiesdll.rtsisseqisenlickanatedehfkehkgvvpkavtr 
atlgkfvglgsssggstltqqlikqqvvgdaptlarkaaeivdalaleramnkdeilttylnvapfgrnnkgqniaga 
rqaaegifgvdasqltvpqaaflaglpqspityspyentgelksdedleiglrrakavlysmyrtgalskdeysqykdy 
dlkodflpsgtvtgisrdylyfttlaeaqermydyi-aqrdnvsakelkneatqkfyrdlaakeienggykitttidqki 

hsamqsavadygyllddgtgrvevgnvlmdnqtgailgfvggrnyqenonnhafdtkrspasttkpllaygiaidqg
lmgsetilsnyptnfangnpimyanskgtgmmtlgealnyswnipaywtyrmlrekgvdvkgymekmgyeipeygie 
slpmgggievtvaqhtngyqti^nngvyhqkhviskieaatcrvvyeyqdkpvqvyskatatimqgllrevlssrvtt 
tfksnltslnptuu^iadwigktgttnqdenmwlmlstprltlggwighddnhslsrragysnnsnymahl^ 
spsiwgnerfaldpsvvksevlkstgqkpekvsvegkevevtgstvtsywanksgapatsyrfaiggsdadyqnawssi 

vgslptpsssssssssssdssnssttrpsssrarrz

ID114 1974bp 

ATGAAAAAATTTTATCTAAGTCCAATrTTTCCT 

TATTmGTTAATAATAATCTGTTGACGGTTTTAATTTTG i i' i C l l 'r il GTAGGAGGCTATGTTnTTTATTTAAGAA

ACTGAGAOTGCATTATACAAGGAGTGATGTAGAACAGATACAGTATGTAAACCACCAAGCGGAAGAAAGTTTGAC 
AGCTCTATTGGAACAGATGCCTGTAGGTGTrATGAAATTGAATTTATCTTCTGGAGAGGT^ 
TATGCTGAATTGATTTTGACCAAGGAAGATGGTGATrrrGArTTAGAAGCTGTTCAAACGATTATCAA 
TAGGAAATCCGTCTACTTATGCCAAGCTTG^^ 

GTATTTTGTAGATGTATCCAGGGAACAAGCCATAACAGATGAATTGGTAACAAGTAGACCAGTGATTGGGATTGT
CT CrGT GGATAATTA TGAT GATTrGGACGATGAAACTTCTGAGTCAGATATTAGTCAAATCAATAGTTrrGTAG^ 
AATTTrATATCAGAGTrrrCAGAAAAACACATCATGTTTrCTCGTCGGGTAAGTATGGATCGATm 
TGACTACACGGTGCTTGAGGGCrrGATGAATGATAAATTrrCTGTTATrGATGCTTTCAGAGAAGAGTCGAAAC^^ 
AGACAGTTGCCCrrGACCTTAAGTATGGGATTTTOTATGGCGATGGAAATCATGATGAGATAGGGAAAGTO 

TGCTCAATTTGAACrrGGCTGAAGTACGTGGTGGCGACCAGGTGGTTGTTAAGGAAAACGACGAAACGAAAAATC



CAGmATTTTGGTGGTGGGTCTGCTGCrrCAATCMGCGTACACGGACTCGTACGCGCGCTATGATGACAGCT^
TTCAGATAAGATTCGGAGTGTAGATCAGU i ITMGTAGTCGGTCACAAAAATTTAGACATGGATGCTTTGGGCTCT 
GCTGTAGGTATGCAGrrGTTCGCCAGCAATGTGATTGAAAATAGCTATGCTCTTTATGATGAAGAACAAATGTCTC 
CAGATATTGAACGAGCTGTITCATTCATAGAAAAAGAAGGAGTTACGAAGTTGTrGTCTGTTAAGGATGCAATGG 
GGATGGTGACCAATCO i I Ll-n Gl IGATTCTTGTAGACCATTCAAAGACAGCCTTAACATTATCAAAAGAATTTTA

TGATTTATTTACCCAAACCATTGTTATTGACCACCATAGAAGGGATCAGGATTTTCCAGATAATGCGGT^ 
TATATCGAAAGTGGTGCAAGTAGTGCCAGTGAGTTGGTAACGGAArrGATTCAGTTCCAGAArrCTAAGAAAAAT 
CGTTTGAGTCGTATGCAAGCAAGTGTCTTGATGGCTGGTATGATGTTGGATACTAAAAATTTCACCTCGCGAGTAA 
CTAGTCGGACATTTGATGTTGCTAGCTATCTCAGAACGCGCCGAAGTGATAGTATTGCTATCCAGGAAATCGCTGC 

GACAGATTTrGAAGAATATCGTGAGGTCAATGAACTTATTTTACAGGGGCGTAAATTAGGTTCAGATGTACT/^^
GCAGAGGCTAAGGACATGAAATGCTATGATACAGTTGTTATTAGTAAGGCAGCAGATGCCATGTTAGCCATGTC^ 
GGTATrGAAGCGAGTTTTGTTCTTGCGAAGAATACACAAGGATTTATCTCTATCTCAGCTCGAAOT^ 
TGAATGTACAACGGATTATGGAAGAGTTAGGCGGTGGAGGCCACTTTAATrrGGCAGCAGCrCAAATTAAAGATG 
TAACCTTGTCAGAAGCAGGTGAAAAACTGACAGAAATTGTATTAAATGAAATGAAGGAAAAGGAGAAAGAAGAA 

TGA

MKKFyVSPimLVGLL\FGVL^FIIFVNNNLLTVLILFLFVGGYVFLFKKLRVHYTI^DVEQIQYVNHQAEESLT^ 

QMPVGVMKLNX-SSGEVEWFNPYAELILTKEDGDFDLEAVQTIIKASVGNPSTYAKLGEKRYAVHMDASSGVLYFVDVS 

REQArrDELVTSRPVlGlVSVDNYDDLEDETSESDISQINSFVANBS€FSEKHMMFSRRVSMDRFYLFTDYTVLEGLMN 

DKFSVIDAFREESKQRQLPLTLSMGFSYGDGNHDEIGKVALLNLNLAEVRGGDQWVKENDETKNPVYFGGGSAASK
RTRTRTRAMMTAISDKIRSVDQVFVVGHKNLDMDALGSAVGMQLFASNVIENSYALYDEEQMSPDIERAVSHEKEGV 
TKLLSVKDAMGMVTNRSLULVDHSKTALTI^KEFYDLFTQTIVIDHHRRDQDFPDNAVITYIESGASSASELVTELIQFQ 
NSKKNRI^RJvlQASVLMAGMMLDTKNFTSRVTSRTFDVASYLRTRGSDSlAIQElAATDFEEYREVNELILOGRKLGSDV 
LIAEAK0MKCYDTVVISKAADAMLAMSGlEASFVLAK^^^QGFISISARSRSKL^rVQRlMEELGGGGHFNU^AAQIKD^ 

LSEAGEKLTEIVLNEMKEKEKEEZ

ID115 663bp

ATGAAGTGCTOTTATGTCGGCAGACTATGAA^ 

ACTCTTGTCTTTGTtCAGACTGTGATTCTACTTrrGAAAGAATTGGGGAAGAGAACT

AGAGTTGTCAACAAAGTGTCAAGATTGTCAACTTTGGTGTAAAGAGGGAGrrGAAGTCAGTCATAGAGCGATm 
TAOTACAATCAAGCTATGAAGGATTTTTTCAGTCGGTATAAGTTTGATGGAGACnTCCTGTT^ 
GCTTCATTTTTAAGTGAGGAGTTGAAAAAGTACAAAGAGTATCAATTTGTTGTAATTCCCCTAAGTCCT 
ATGCTAATAGAGGATTTAATCAGGTTGAGGGCTTGGTAGAGGCAGCAGGCTTTGAGTATCTGGATrrATTAGAGA 

AAAGAGAAGAGAGAGCCAGTTCTTCTAAAAATCGTTCAGAGCGCTTGGGGACAGAACTTC Cl VlCl II ATTAAAA

GTGGACTCACTATTCCTAAAAAAATCCTACTTATAGATGATATCTATACrACAGGAGCAACrATAAATCGTGT^ 
GAAACTGTTGGAAGAAGCTGGTGCTAAGGATGTAAAAACAlT'lTCCCTrGTAAGATGA 

MKCLLCGQTMIC^VLTFSSIXLLRNDDSCLCSDCDSTFERIGEENCPNC^^TELSTKCQDCQLWCKEGVEVSHRAI^^ 
NQAMKDFFSRYKFDGDFLLRKVFASFLSEEUCKYKEYQFVVIPLSPDRYANRGFNQVEGLVEAAGFEYLDLLEKREER
ASSSKNRSERLGTELPFFIKSGVTIPKKILLIDDIYTTGATINRVKKLLEEAGAKDVKTFSLVRZ 

ID116 1299bp

ATGAAAGTAAATTTAGATTATCTCGGTCGTTTATTTACTGAGAATGAATTAACAGAAGAAGAACGTCAGTTGGCG
GAGAAACTTCCAGCAATGAGAAAGGAGAAGGGGAAACTTrrCTGTCAACGCTGTAATAGTACTATTCTAGAAGAA 
TGGTATTTGCCCATCGGTGCrrACTATTGTCGAGAGTGOT 

ACTATTTrCCGCAGGAGGATTTTCCAAAGCAAGATGTTCTCAAATGGCGCGGCCAATTAACTC L ' lUiU 'C 
GGTGTCAGAGGGATTGCTTCAAGTAGTAGACAAGCAAAAGCCAACCTTAGTTCA7GCGGTAACAGGAGCTGGAAA 
50 GACAGAAATGATTTATCAAGTAGTGGCTAAAGTGATCAATGCGGGTGGTGCAGTGTGTTTGGCTAGTCCTCGCAT 
AGATGTTTGTTTGGAGCTGTACAAGCGCCTGCAACAGGATTmOT 
GAACCTTATTTTCGAACACCACTAGTTGTTGCA 

GATAGTGGATGAAGTAG ATGCnT TTCCTTATGTTGATAATCCCATGCTTTACCACGCTGTCAAGAATAGTGTAAAG 
GAGAATGGATTGAGAATCTTTTTAACAGCGACTTCGACCAATGAGTTAGATAAAAAGGTCCGTTTAG 

AAAAGACTGAATTTACCGAGACGGTTTCATGGAAATCCGTTGATTATTCCAAAACCAATT^

ATCGCTACT TAGAC AAGAATCGTTTGTCACCAAAGTTAAAGTCCTATATTGAGAAGCAGAGAAAGACAGCTTATC 
CGTTACTCATTTTTGCTTCAGAAATTAAGAAAGGGGAGCAGTTAGCAGAAATCTTACAGGAGCAAT^ 
AGAAAATTGGCTTrGTATCTTCTGTAACAGAGGATCGATTACAGCAAGTACAAGCTTTTCGAGAT^ 
CAATACTTATCAGTACGACAATCrTGGAGCGCGGAGTTACCTTCCCTTGTGTGGATGTTIT 

TCATCG TTTGT TTACCAAGTCTAGTTTGATTCAGATTGGTGGACGAGTTGGACGAAGCATGGATAGACCGACAGGA
GATTTGCI-ri-rcri'CCATGATGGGTTAAATGCTTCAATCAAGAAGGCGATTAAGGAAATTCAGATGATGAATAAGG 
AGGCTGGTCTATGA 

MKVNLDYLGRLFTENELTEEERQLAEKLPAMRKEKGKLFCQRCNSTILEEWYLPIGAYYCRECLLMKRVRSIXJTLYYF 
PQEDFPKQDVLKWRGQLTPFQEKVSEGLLQVVDKQKPTLVHAVTGAGKTEMIYQVVAKVINAGGAVCLASPRIDVCL
ELYKRLQQDFSCGlALLHGESEPYFRTPLVVATTHOLLKFYQAFDLLrVDEVDAFPYVDNPMLYHAVKNSVKENGUUF
LTATSTNELI>KKVRLG£LKKLNLPRRFHGNPLIIPKPIWl^DFNRYLDKNRLSPKLKSYIEKQRKTA^TLUFASEI^ 
QLAElLQEQFPNEKIGFVSSVTCDIU^QVQAPRDGELmrSTriLERGVTFPCVDVFVVEANHRLFrKSSLIQIGG 
MDRPTGDLLFFHDGLNASIKKAIKEIQMMNKEAGLZ

ID117 870bp

ATGCAAATTCAAAAAAGTmAAGGGGCAGTCTCCCTATGGCAAGCTGTATCrAGTGGCAACGCCGATTGGCAA^ 
CTAGATGATATGACTTTTCGTGCTATCCAGACCTTGAAAGAAGTGGACTGGATTGCTGCTGAGGATACGCGCAAT 
ACAGGGCTTTTGCTCAAGCATTTTCACATTTCCACCAAGCAGATCAGTmCATGAGCACAA^^
ATTCCTGATTTGATTGGTTTCTTGAAAGCAGGCCAAAGTATTGCTCAGGTCrCTGATGCCGGTr^ 
CAGACCCrrGGTCATGATTTAGTTAAGGCAGCTATTGAGGAAGAAATrGCAGTTGTGACAGTrCCAGGTGCCrCTGC 
AGGAATTrCTGCCTT GATTGCC AGTGGTTTAGCGCCACAGCCACATATCri il'ACGG 1 i'l 1 1'i ACCGAGAAAATCA 
GGTCAGCAGAAGCAATTTTTTGGCTTGAAAAAAGATTATCCTGAAACACAGATTTmATGAATCAC^ 
TAGCAGACACGTTGGAAAATATGTTAGAAGTCTACGGTGACCGCTCCGTrGTCTTGGTCAGGGAATTGACCAAAA
TCTATGAAGAATACCAACGAGGTACTATCTCTGAGTTATTAGAAAGCATTGCTGAAACGCCACTCAAGGGCGAAT 
GTCTTCTCATTGTTGAGGGTGCCAGTCAGGGTGTGGAGGAAAAGGACGAGGAAGACTTGTTCGTAGAAATTCAAA 
CCCGCATCCAGCAAGGTGTGAAGAAAAACCAAGCTATCAAGGAAGTCGCTAAGATTTACCAGTGGAATAAAAGTC 
AGCTCTACGCTGCCTACCACGACTGGGAAGAAAAACAATAA 



MOlOKSFKGQSPYGKLYLVATPIGNLDDNn"FRAlQTLKEVDWIAAEDTRNTGLLLKHFDISTKQZSFHEHNAKEKIPDU
GFLKAGQSUQVSDAGLPSISDPGHDLVKAAIEEEIAVVTVPGASAGISALIASGLAPQPHIFYGFLPRKSGQQKQFFGLKK 
DYPETQIFYESPHRVADTLENMLEVYGDRSVVLVRELTOYEEYQRGTISELLESIAETPLKGECLLIVEGASQGVEEKDE 
EDLFVEIQTRiQQGVKKNQAlKEVAKIYQWKKSQLYAAYHDWEEKQZ 

ID118 345bp



ATGATAAAGAAAGGAAAGGGCTGTTTTATGGACAAAAAAGAATTATTTGACGCGCTGGATGATTT^ 
TTATTGGTAACCTTAGCCGATGTGGAAGCCATCAAGAAAAATCTCAAGAGCCTGGTAGAGGAAAATACAGCTCrr 
30 CGCTTGGAAAATAGTAAGrrGCGAGAACGCTTGGGTGAGGTGGAAGCAGATGCTCCTGTCAAGGCCAAGCATGTr 
CGCGAAAGTGTCCGTCGTATTTACCGTGATGGATTTCACGTATGTAATGATTTTTATGGACAACGTCGAGAGCAGG 
ACGAAGAATGTATCTTTTGTGACGAGTTGTTATACAGGGAGTAA 

MIKKGKGCFMDKKELFDALDDFSQQLLVTLADVEAlKKNLKSLVEENTALRLENSiaRERLGEVEADAPVICAKHVRES 
35 VRRIYRDGFHVCNDFYGQRREQDEECMFCDELLYREZ 

ID119 639bp

ATGTCAAAAGGATTTTTAGTCTCn'CTTGAGGGACCAGAGGGAGCAGGCAA 
40 CCAATTTTAGAGGAA^AAGGAGTAGAGGTGTTGACGACCCGTGAACCTGGCGGAGTCTTGATTGGGGAGAAGATT 
CGGGAAGTGATTTTGGATCCAAGTCATACTCAGATGGATGCTAAAACAGAGCTACTTCTCTATATTGCCAGTCGCA 
GACAGCATTTGGTGGAAAAAGTTCTTCCAGCCCTTGAAGCrGGCAAGTTGGTCATCATGGATCGrrrTATCGATAG 
TTCTGTTGCCTATCAGGGATTTGGTCGTGGCTTAGATATTGAAGCCATTGACTGGCT 

GGCCTCAAACCCGATTTGACACTCTATTTTGACATCGAGGTGGAAGAAGGGCTGGCTCGTATTGCTGCTAATAGTG 
45 ACCGCGAGGTTAATCGTTTGGATTTGGAAGGGTTGGACTTGCATAAAAAAGTTCGTCAAGGCTACCTTTCT 

GGATAAAGAGGGAAATCGCATTGTCAAGATTGATGCTAGTCTCCCTTTGGAGCAAGTTGTGGAAACTACCAAGGC 
TGTCTTGTTTGACGGAATGGGCTTGGCCAAATGA 

MSKGFLVSLEGPEGAGKTSVLEALLPILEEKGVEVLTTREPGGVLIGEKIREVILDPSHTQMDAKTELLLYIASRRQHLVE 
50 KVLPALEAGKLVIMDRFIDSSVAYQGFGRGLDIEAIDWLNQFATDGLKPDLTLYFDI£VEEGLARlAANSDREVNjaJ>L 
EGLDLHKKVRQGYLSLLDKEGNRIVK1DASLPLEQVVETTKAVLFDGMGLAK2 

ID120 408bp

55 ATGGTAGAACAAAGAAAATCAATTACCATGAAAGATGTTGCTTTAGAAGCAGGAGTTAGTGTTGGAACTGTTTCA 
CGTGTAATTAATAAAGAAAAAGGCATTAAAGAAGTAACTTTGAAAAAAGTGGAACAAGCGATTAAAACTTTGA^^ 
TACATTCCAGATTACTACGCTAGAGGAATGAAAAAAAATCGAACAGAAACGATTGCAATCATTGTACCAAGTATC 
TGGCATCCCTTCTTTTCAGAATTTGCTATGCATGTGGAAAATGAAGTCTATAAGAGAAATAACAA^ 
GTTCTATCAATGGTACAAATAGAGAGCAAGACTATCTGGAGATGTTGCGTCATAATAAAGTTGATGGAGTGGTTG 

60 CCATTACCTATAGGCCAATrGAACATTACTTGACGTCAGGAATTCCCTTTGrrAGTATTGACCGCACATACTCAGA 
GATTGCCATTCCTTGTGTTTCA 
MVEQRKSrrMKDVALEAGVSVGWSRVIhrKEKGmEVTLKKVEQAlKTLOTIPDYYARGMKKNRTETIAIIVPSIWHPFF
SEFAMHVENEVYKRNNKLLLCSINCTNREQDYLEMLRHNKVDGVVAITYRPIEHYLTSGIPFVSIDRTYSEIAIPCVS 
ID121 285bp

ATGAATATATTTAGAACAAAGAATGTTAGTn"AGATAAAACAGAGATGCATAGGCATTTGAAGTrATGGGATTTG 
ATTTTGCTGGGTATCGGAGCCATGGTAGGGACAGGCGTCTTTACAATCACAGGTACTGCAGCTGCAACACTT 
5 GCCCAGCCCTAGTGATTTCAATCGTTATTTCTGCCrrGTGTGTGGGATTATCAGCCCTCrrr^ 

TCGCGAGTACCCGCTACAGGAGGTGCCTATAGTTACCTCrATGCTATCTTAGGAGAATTCCCTGCCTGGTrGGCrG 
GrrGGTTAACCATGATGGAGTTCATGACAGCCATATCAGGCGTAGCTTCGGGTTGGGCAGCTTATrTTAA 

MNTFRTKWSLDKTEMHRHLKLWDLlU.GIGAMVGTGVFTn'GTAAATLAGPALVISrVISALCVG 
10 ATGGAYSYLYAILGEFPAWLAGWLTMMEFMTAISGVASGWAAYF 

ID124 nilbp

ATGAAATCAAGAGTAAAGGAAACGACTATGGATAAAATTGTGGTTCAAGGTGGCGATAATCGTCTGGTAGGAAGC 

15 GTGACGATCGAGGGAGCA AAAA ATGCAGTCT TACC CrrGTTGGCAGCGACTATTCTAGCAAGTGAAGGAAAGACC 
GTCTTGCAGAATGTTCCGATTTTGTCGGATGTCTTTATTATGAATCAGGTAGTTGGTGGTTTGAATC 
ACTTTGATGAGGAAGCTCATCTTGTCAAGGTGGATGCTACTGGCGACATCACTGAGGAAGCCCCTTACAAGT^ 
TCAGCAAGATGCGCGCCTCCATCCTTGTATTAGGGCCAATCCTTGCCCGTGTGGGTCATGCCAAGGTATCCATGCC 
AGGTGGTrGTACGATTGGTAGCCGTCCTATTGATCTTCArrrGAAAGGTCTGGAAGCTATGGGGGTTAAGATrAGT 

20 CAGACAGCTGGTTACATCGAAGCCAAGGCAGAACGCrrGCATGGTGCrCATATCTATATGGACTTTCCAAGTGTTG 
GTGCAACGCAGAAGTTGATGATGGCAGCGACTCTGGCTGATGGGGTGACAGTGATTGAGAATGCTGCGCGTGAGC 
CTGAGATTGTTGACTTAGCCATTCTCCTTAATGAAATGGGAGCCAAGGTCAAAGGTGCTGGTACAGAGACTATAA 
CCATTACTGGTGTTGAGAAACTTCATGGTACGACTCACAATGTAGTCCAAGACCGTATCGAJs^GCAGGAACaT^ 
GGTAGCTGCTGCCATGACTGGTGGTGATGTCTTGATTCGAGACGCTGTCTGGGAGCACAACCGTCCCTTGATTGCC 

25 AAGrTACrrGAAATGGGTGTTGAAGTAATTGAAGAAGACGAAGGAArrCGTGTTCGTTCrCAACTAGAAAATCT 
AAAGCTGTTCATGTGAAAACCTTGCCCCACCCAGGATTTCCAACAGATATGCAGGCrCAATTTACAG 
CAGTTGCAAAAGGCGAATCAACCATGGTGGAGACAGTrrrCGAAAATCGTTTCCAAACCTAGAAGAGATGCGCCG 
CATG GGCTTGCATTCTGAGATTATCCGTGATAC AGCTC GTATrGTTGGTGGACAGCCTTTGCAGGGAGCAGAAGTT 
CTTTCAACTGACCTTCGTGCCAGTGCGGCCrrGATrrTGACAGGTTrGGTAGCACAGGGAGA^ 

30 AATTGGTTCACTTGGATAGAGGrrACTACGGrrrcCATGAGAACrrGGCGCAGCTAGGTGCn'AAGATTCAG 
TGAGGCAAGTGATGAAGATGAATAA 

MKSRVKETSMDKrWQGGDNIU,VGSVTIEGAKNAVLPLLAATlLASEGKTVLQNVPILSDVFIMNQVVGGLNAKVDF^ 
EEAHLVKVDATGDrTEEAPYKYVSKMRASIVVLGPILARVGHAKVSMPGGCTIGSRPIDLHLKGLEAMGVfCISQTAGYIE 
35 AKAERLHGAHIYMDFPSVGATQNLMMAATLADGVTVIENAAREPEIVDLAIU^NEMGAKVKGAGTETim 

TTHNVVQDRIEAGTFMVAAAMTGGDVURDAVWEHNRPLUKLLEMGVEVIEEDEGTRVRSQLENUCAVHVKTLPHP 
GFPTDMQAQFTALMTVAKGESTMVETVFENRFQHLEEMRRMGLHSEURDTARIVGGQPLQGAEVLSTDLRASAAUL 
TGtVAQGETVVGKLVHLDRGYYGFHEKLAQLGAKlQRIEASDEDEZ 

ID125 noibp

ATGrrATTAGCGTCAACAGTAGCCTTGTCATTTGCCCCAGTATTGGCAACTCAAGCAGAAGAAGTrCTTTGGACTG 
CAC GTAG TGTTGAGCAAATCCAAAACGATTTGACTAAAACGGACAACAAAACAAGTTATACCGTACAGTATGGTG 
ATACTTTG AGCA CCATTGCAGAAGCCTTGGGTGTAGATGTCACAGTGCTrGCGAATCTGAACAAAATCACTAATAT 

45 GGACTTGATTTTCCCAGAAACTGTTTTGACAACGACTGTCAATGAAGCAGAAGAAGTAACAGAAGTTGAAATCCA 
AACACCTCAAGCAGACTCTAGTGA AGAAG TGACAACTGCGACAGCAGATTTGACCACTAATCAAGTGACCGTTGA 
TGATCAAACTGTTCAGGTTGCAGACCmCTCAACCAATTGCAGAAGTTACAAAGACAGTGATrGCrrCT^ 
GTGGCACCATCTACGGGCACrrCTGTCCCAGAGCAGCAAACGACCGAAACAACTCGCCCAGTTGCAGAAGAAGCT 
CCTCAGGAAACGACTCCAGCTGAGAAGCAGGAAACACAAACAAGCCCTCAAGCTGCATCAGCAGTGGAAGCAAC 

50 TACAACAAGTTCAGAAGCAAAAGAAGTAGCATCATCAAATGGAGCTACAGCAGCAGTTTCTACTTATCAACCAGA 
AGAAACGAAAGTAATTTCAACAACTTACGAGGCrrCCAGCTGCGCCCGArrATGCTGGACTTGCAGTAGCAA^ 
TGAAAATGCAGGTCTTCAACCACAAACAGCTGCCmAAWGAAGAAATTGCTAACTTGTTTGGCATTACATCC 
AGTGGTTATCGTCCAGGAGACAGTGGAGATCACGGAAAAGGTTTGGCTATCGACTTTATCGTACCAGAACGTTCA 
GAATTAGGGGATAAGATrGCGGAATATGCTATTCAAAATATGGCCAGCCGTGGCATTAGTTACATCATCTGGAAA 

55 CAACGTTTCTATGCTCCATTCGATAGCAAATATGGGCCAGCTAACACTTGGAACCCAATGCCAGACCGTGGTAGT 
GTGACAGAAAATCACTATGATCACGTTCACGTTTCAATGAATGGATAA 

MUJ^STVAI^FAPVLATQAEEVLWTARSVEQIONDLTKTDNKTSYTVQYGDTLSmEALGVDVTVI^ 
IFPETVLTTTVNEAEEVTEVEIQTPQADSSEEVTTATADLTTNQVTVDDQTVQVADI^QPIAEVT^ 
60 VPEEQTTETTRPVAEEAPQErrPAEKQETQTSPOAASAVEATTTSSEAKEVASSNGATAAVSTYQPEETKVXSTTYHAPA 
APDYAGLAVAKSENAGLQPQTAAFKKKLLTCLALHPLVVIVQETVE^'EKVWL5TLWYQ^^VQ^^ZGIR^ 
VALVTSSGNNVSMLHSlAKMGQLTLGTOCQTVVVZQKirMITFTFQZMD 
ID126 nSlbp

5 TTGTTTAAG AAAAATAAAG ACATTCTT AATATTGCATrGCCAGCTATGGGTGAAAA C TT 1 ■ I TGCAGATGCTAATGG 

GAATGGTGGACAGTTA TTTGG TTGO'CATTTAGGATTGATAGCTArrTCAGGGGrrTCAGTAGCT 
CACCATTTATCAGGCGATTTrCATCGCTCTGGGAGCTGCTATTTCCAGTGTTATTTCA 
GACCAGTCGAAGTTGGCCTATCATGTGACTGAGGCGTTGAAGATTACCTTACTATTAAGTTTC CITl ' r AG 
TGTCCATGTTCGCTGGGAAAGAGATGATAGGACri-riGGGGACGGAGAGGGATGTAGCTGAGAGTGGTGGACTGT 
10 ATCrATCTTTGGTAGGCGGATCGArTGTTCTCTTAGGTTTAATGACTAGTCTAGGAGCC^ 

TAATCCACGTCTGCCTCTCTATGrrAGTTTTTTATCCAATGCCTrGAATATTCTTr^ ' 

TCTGGATATGGGGATAGCTGGTGTTGCTTGGGGGACAATTGTGTCTCGTTTGGTrGGTCrTGTGAT^ 

AATTAAAACTGCCrrATGGGAAGCCAACrrTTGGTrrAGATAAGGAACTGTTGACC^ 

AGAGCGA<nTATGATGAGGGCTGGAGATGTAGTGATCATTGCCTTGGTCGlinCi'riTGGGACGGAGGCAGTTGCT 
15 GGGAATGCAATCGGAGAAGTCTTGACCCAGTrrAACTATATGCCTGCCTTTGGCGTCGCTACGGCAACGGTCATC 
CTGTTGGCCCGAGCAGTTGGAGAGGATGATTGGAAAAGAGTTGCTAGTTTGAGTAAACAAACCrm 
TGTTCCTCATGTTGCCCCTGTCCTTTAGTATATATGTCTTGGCTGTACCATTAACTCATCTCTATACGACTGAT^ 
CTAGCGGTGGAGGCTAGTGTTCTAGTGACACTGTTTTCACTACTTGGGACCCCTATGACGACAGGAA^ 
ATACGGCAGTCTGGCAGGGATTAGGAAATGCACGCCTCCCTTTTTATGCGACAAGTATAGGAATGTGGTGTATCC 
20 GCATTGGGACAGGATATCTGATGGGGArrGTGCrrGGTTGGGGCrrGCCTGGTATTTGGGCAGGGTCTCTC^ 
TAATGGTTTTCGCTGGTTATTTCTACGCTATCGTTACCAGCGCTATATGAGCTTGAAAGGATAG 

LFKKNKDILNlALPAMGENFLQMLMGMVDSYLVAHLGLlAtSGVSVAGNirriYQAIHALGAAISSVISKSIGQKDQSKLA 
YKVTEALKnXLLSFLLGFLSIFAGKEMlGLLGTERDVAESGGLYLSLVGGSIVLLGLMTSLGALIRATHNPRLPLyVSFL 
25 SNALNILFSSLAIFVLDMGIAGVAWGTIVSRLVGLVILWSQUCLPYGKPTFGLDKELLT1J\.LPAAGERLMMRAGDWIIA 
LVVSFGTEAVAGNAIGEVLTQFNYMPAFGVATATVMLLARAVGEDDWKRVASLSKQTFWLSLFLMLPLSFSJYVLGVP 
LTHLYTTDSLAVEASVLVTLFSLLGTPMTTGTVIYTAVWQGLGNARLPFYATSIGMWCIRIGTGYLMGIVtGWGl^ 
AGSLLDNGFRWLFLRYRYQRYMSLKGZ 

ID127 894bp

GTGGGA AGAAT TATCAGAGCAGGTGTAAAGATGGAACATCTTGGAAAAGTATTTCGTGAATTTCGAACAAGTGGA 

AATTATTCrrTAAAGGAAG CAGC AGGCGAATCCTGCTCTACCTCTCAGTTATCTCGCTTTGA 

ACCTCGCAGTCTCCCGTrrCTTTGAGATTTTGGATAACATTCATGTAACAATCGAAA 

35 GAATTTTCATAATCATGAACATGTGTCTATGATGGCACAGATTATCCCACnTrACTATTCAAACGATATTCCAGG^ 
TTTCAAAAGCTTCAAAGAGAACAACTTGAAAAGTCTAAGAGTTCGACGACTCCCCTTTATT^ 
TTTrGCTACAAGGTCTGATTTGTCAAAGAGATGCGAGTTATGATATGAAGCAGGATGATTTGGGTAAGGTAGCAG 
ATTATCTCTTCAAAACAGAAGAATGGACCATGTATGAGTTGATTCTTTTCGGTAACCTCrATAGT^ 
AGACTATGTCACTCGGATTGGTAGAGAAGTTATGGAGAGGGAGGAATTTTACCAAGAGATTAGTCGCCATAAGAG 

40 ATTAGTGTTGATTTTGGCCCrCAATTGTTACCAGCATTGTrrAGAGCATTCrrCT^^ 

AGGCTTATACAGAGAAGATTATTGACAAAGGTATTAAGCTTTATGAGCGTAAT GVrri CCATTATTTAAAAGGm 
TGCCTTATATCAAAAAGGACAGTGTAAAGAAGGCTGTAAGCAGATGCAAGAGGCCATGCATATTTTTGATGTGTT 
AGGTCTTCCAGAGCAAGTAGCCTATTATCAGGAACACrACGAAAAATTTGTCAAAAGTTAA 

45 VGRIIRAGVKMEHLGKVFREFRTSGNYSLKEAAGESCSTSQLSRFELGESDLAVSRFFEILDNIHVTIENFMDKARNFHN 
HEHVSMMAQIIPLYYSNDIAGFQKLQREOLEKSKSSTTPLYFELNWILLQGLICQRDASYDMKQDDLGKVADYLFKTEE 
WTMYELILFGNLYSFYDVDYVTRJGREVMEREEFYQEISRHKRLVLILALNCYOHCLEHSSFYNANYFEAYTEKIIDKGl 
KLYERNVFHYLKGFALYQKGQCKEGCKQMQEAMH1FDVLGLPEQVAYYQEHYEKFVKS2 
TABLE 3
ID1 mSbp

5 atgtctaacattcaaaacatgtccctggaggacatcatgggagagcgctttggtcgctactccaagtacattat^^ 
aagaccgg gcttt gccagatattcgtgatgggrrgaagccggttcagcgccgtattctttattctatgaataag 

TAGCAATACTTTTGACAAGAGCTACCGTAAGTCGGCCAAGTCAGTCGGGAACATCATGGGGAATTTCCACCCACA 
CGGGGATTCTTCTATCTATGATGCCATGGTTCGTATGTCACAGAACTGGAAAAATCGTGAGATTCrrAGTTGAA^ 
CACGGTAATAACGGTTCTATGGACGGAGATCCTCCTGCGGCTATGCGTTATACTGAGGCACGTTTGTCTGAAATTG 
10 CAGGCTACCTTCTTCAGGATATCGAGAAAAAGACaGTTCCT'ITTGCATGGAACTTTGACGATACGGaGAAAGAAC 
CAACGGTCTTGCCAGCAGCCTTTCCAAACCTCTTGGTCAATGGTTCGACTGGGATTTCGGCTGGTTAT^ 
CATTCCTCCCCATAATTTAGCTGAGGTCATAGATGCTGCAGTTTACATGATTGACCACCCAACTGCAAAGATTG^^ 

aaactcatggaattcttgcctggaccagacttccctacaggggctattattcagggtcgtgatgaaatcaagaaa 
gcttatgagactgggaaagggcgcgtggttgttcgttccaagactgaaattgaaaagctaaaaggtggtaa 
15 caaatcgttattattgagattccttatgaaatcaataaggccaatctagtcaagaaaatcgatgatgttcgtgtta 
ataacaaggtagctgggattgctgaggttcgtgatgagtctgaccgtgatggtcttcgtatcgctatcgaacttaa 
gaaagacgctaatactgagcttgttctcaactacttatttaagtacaccgacctacaaatcaactaca^ 

ATGGTGGCGATTGACAATTTCACACCTCGTCAGGTTGGATTGrrCCAATCCTGTCTAGCTATATCGCTCACCGTCG 
AGAAGTGA 


MSNIQNMSLEDlMGERFGRYSKYIIQDRALPDIRDGLKPVQRiULYSMNKDSOTFDKSYRKSAKSVGNIMGNFHPHGDS 
SrYDAMVRMSQNTWKNREILVEMHGNNGSMDGDPPAAMRYTEARLSElAGYLLQDIEKKTVPFAWNFDDTEKEPTVL 

aafpnllvngstgisagyatdipphnlaevidaavymidhptakidklmeflpgpdfptgaiiqgrdeikkayetgkgrv 
vvrskteieklkggkeqiviieipyeinkanlvkkiddvrvnnkvagiaevrdesdrdglriaielkkdantelvlnylfk 
25 ytdlqinynfnmvaidnftprqvglfqsclalsltveicz 

ID12 6S4bp

ATGCCGACATTAGAAATAGCACAAAAAAAACTGGAGTTCATTAAGAAGGCAGAAGAATATTACAATGCCTTGTGT 
30 ACAAATATACAGTTGAGCGGAGATAAACTAAAAGTAATTTCCGTTACTTCTGTTAACCCTGGGGAAGGAAAAACA 
ACTACTTCCATAAATATAGCATGGTCGTTTGCGCGTGCAGGCTATAAAACTCTTTTGATCGA 
ATTCAGTTATGTTAGGAGTTTTTAAATCTCGTCAAAAAATrACAGGGCTAACAGAATTT^ 
TTTATCTCACGGTTrATGTGATACAAATATTGAAAATITATTTGTAGTTCAATCGGGATCTGTATC 
ACAGCCTTGTTACAAAGTAAAAATTTTAATGATATGATTGAAACATTGCGTAAATATTT^ 
35 ATACACCGCCTATTGGAATTGTTATTGATGCGGCAATTATCACTCAAAAGTGTGATGCGTCCATCTTGGTAACAGC 
AACAG GTGAG GCGAATAAACGTGATATCCAAAAAGCGAAACAACAATTAAAACAAACAGGGAAACTGTTCCTAG 
GAGTTGTTTTAAATAAATTGGATATCrCGGTTAATAAGTATGGAGTrTACGGTTCCTATGGAAArrATGCT^ 
ATAA 

40 MPTLEIAQKKLEFIKKAEEYYNALCTNIQLSGDKLKVISVTSVNPGEGKTTTSINIAWSFARAGYKTU.IDGDTRNSVM 

GVFKSREKITGLTEFLSGTADtSHGLCDTNIENLFVVQSGSVSPNPTALLQSKNFNDMIETLRKYFDYinDTPPIGlVIDAA 
IITQKCDASILVTATGEANKRDIQKAKQQLKQTGKLFLGVVLNKLDISVNKYGVYGSYGNYGKKZ 

ID13 n82bp 


ATGGAGGCAAATATGAAACATCT^A^ 

TTTTTAGTGGAGCCTTGGGTAGTTTTTCAATAACTCAACTAACTCAAAAAAGTAG 

TAGTACTATTACACAAACTGCCTATAAGAACGAAAATTCAACAACACAGGCTGTTAACAAAGTAAAAGATGCTGT 
TGTrrCTGTTATrACTTATTCGGCAAACAGACAAAATAGCGTATTTGGCAATGATGATACTGACACAGATTCTC^ 

50 CGAATCTCTAGTGAAGGATCTGGAGTTATTTATAAAAAGAATGATAAAGAAGCTTACATCGTCACCAACAATCAC 
GTTA TTAA TGGCGCCAGCAAAGTAGATATTCGATTGTCAGATGGGACTAAAGTACCTGGAGAAATTGTCCGAGCr 
GACACrrTCTCTGATArrGCTGTCGTCAAAATCTCTTCAGAAAAAGTGACAACAGTAGCTGAGTTTGGTGA^ 
GTAAGrrAACTGTAGGAGAAACTGCTATTGCCATCGGTAGCCCGTTAGGTTCTGAATATGCAAATACTGTCACTCA 
AGGTATCGTATCCAGTCTCAATAGAAATGTATCCrrAAAATCGGAAGATGGACAAGCTATTTCTACAAAACCCAT 

55 CCAAACTGATACTGCTATTAACCCAGGTAACTCTGGCGGCCCACTGATCAATATTCAAGGGCAGGTTATCGGAAT 
TACCTCAAGTAAAATTGCTACAAATGGAQGAACATCTGTAGAAGGTCTTGGTrrCGCAATTCCTGCAAATGATGCT 
ATCAATATTATTGAACAGTTAGAAAAAAACGGAAAAGTGACGCGTCCAGCTTTGGGAATCCAGATGGTTAATTTA 
TCTAATGTGAGTACAAGCGACATCAGAAGACTCAATATTCCAAGTAATGTTACATCTGGTGTAATTGTTCGTTCGG 
TACAAAGTAATATGCCTGCCAATGGT CACC TTGAAAAATACGATGTAATTACAAAAGTAGATGACAAAGAGATTG 

60 CTTCATCAACAGACTTACAAAGTGCTCTTTACAACCATTCTATCGGAGACACCATTAAGATAACCT^ 
CGGGAAAGAAGAAACTACCTCTATCAAACTTAACAAGAGTTCAGGTGATTTAGAATCTTAA 

MEANMKHLKTFYKKWFQLLVVmSFFSGALGSFSn'QLTQKSSVNMSNNNSTrrQTAYKNENSTTQAVNKVKD^^^ 
ITYSANRQNSVFGNDDTDTDSQRISSEGSGVIYKKNDKEAYrVTNNHVINGASKVDIRLSDGTKVPGEIVGADTFSDlAV 
65 VKISSEKVTTVAEFGDSSKLTVGETAIAIGSPLGSEYANTVTQGIVSSLNRNVSLKSEDGQAISTKAIQTDTAINPGNSGGP 






LINIQGQVIGrrSSKIATNGGTSVEGLGFAIPANDAlNllEQLEKNGKVTIU^ALGIQMVNLSNVSTSDIRRLNlPSNVTSGVIV 
RSVQSNMPANGHLEKYDVITKVDDKEIASSTDLQSALYNHSIGDTlICrrYYRNGKEETTSIKLNKSSGDLESZ 

IDlg 939bp

ATGGCAGAAATTTATCTAGCAGGTGGTTGI i T i I GGGGCCT AG AGGAAT A 1 1 L ll CACGCATTTCTGGAGTGCTAG 
AAACCAGTGTTGGCTACGCTAATGGTCAAGTCGAAACGACCAATTACCAGTTGCTCAAGGAAACAGACCATGCAG 

aaacggtccaagtgatttacgatgagaaggaagtgtcactcagagagattttactttattatit 
tcctctatctatcaatcaacaagggaatgaccgtggtcgccaatatcgaactgggatttattatcaggatgaagc 
10 agatttgccagctatctacacagtggtgcaggagcaggaacgcatgctgggtcgaaagattgcagtagaagtgga 
gcaattacgccactacattcrggctgaagactaccaccaagactatctcaggaagaatccttcaggttactgtcat 
atcgatgtgaccgatgctgataagccattgattgatgcagcaaactatgaaaagcctagtcaagaggtgttgaag 

GCCAGTCTATCTGAAGAGTCTTATCGTGTCACACAAGAAGCTGCTACAGAGGCTCCATTTACCAATGCCTATGACC 
AAACCTTTGAAGAGGGGATTTATGTAGATATTACGACAGGTGAGCCACTCiM'rj'riGCCAAGGATAAGTTTGCTTC 
15 AGGTTGTGGTTGGCCAAGTTTTAGCCGTCCGATTTCCAAAGAGTTGATTCATTATTACAAGGATCTGAGCCATGGA 
ATGGAGCGAATrGAAGTTCGTTCTCGTTCAGGCAGTGCTCACTTGGGTCATGTTTTCACAGATGGACCGCGGGAG^ 
TAGGCGGCCTCCGrrACTGTATCAATTCTGCTTCTITACGCTTTGTGGCCAAGGATGAGATGGAAAAAGCAGGATA 
TGGCTATCTATTGCCTTACTTAAACAAATAA 

20 MAEIYI^GGCFWGLEEYFSWSGVLETSVGYANGOVETTNYQLliETDHAETVQVIYDEKEVSLRHLLYYFRVIDPLSl 
NQQGNDRGRQYRTGIYYQDEADLPAIYTVVQEQERMLGRKIAVEVEQLRHYILAEDYHQDYLRKNPSGYCHIDVTDA 
DKPLIDAANYEKPSQEVLKASLSEESYRNO-QEAATEAPFTNAYDQTFEEGIYVDITTGEPLFFAKDKFASGCGWPSFSRPI 
SKELIHYYKDl^HGMERIEVRSRSGSAHLGHVFTDGPRELGGLRYCINSASLRFVAKDEMEKAGYGYLLPYLNKZ 

ID17 870bp

ATGAAGATTATTGTACCTGCAACCAGTGCCAATATCGGGCCAGGTTTTGACTCGGTCGGTGTAGCTGTAACCAAGT 
ATCTTCAAj^TTGAGGTCTGCGAAGAACGAGATGAGTGGCTGATTGAACACCAGATTGGCAAATGGATTCCACATG 
ACGAGCGTAATCrCTTGCTCAAAATCGCrrTGCAAATTGTACCAGACrrGCAACCAAGACGCTrGAAAATGACCA 

30 GTGATGTCCCTTTGGCGCGCGGTTTGGGTrCTTCCAGCrCGGTTATCGTTGCTGGGATTGAACTAGCCAACCA^ 

GGGTCAACTCAACTTATCAGACCATGAAAAATTGCAGTTAGCGACCAAGATTGAAGGGCATCCTGACAATGTGGC 
TCCAGCCATTTATGGTAATCTCGTTATTGCAAGTTCTGTTGAAGGGCAAGTCTCTGCTATCGTAGCAGACT^ 
GAGTGTGATTrrCTAGCTTACATTCCAAACTATGAATTACGTACTCGCGACAGCCGTAGTGTCTTGCCTAA^^^ 
TGTCTTATAAGGAAGCTGTTGCTGCAAGTTCTATCGCCAATGTAGCGGTTGCTGCCTTGTTGGCAGGAGACATGGT 

35 GACCGCTGGGCAAGCAATCGAGGGAGACCTCTTCCATGAGCGCTATCGTCAGGACTTGGTAAGAGAATTTGCGAT 
GATTAAGCAAGTGACCAAAGAAAATGGGGCCTATGCAACCTACCTTTCTGGTGCTGGGCCGACAGTTATGGTTCT 
GGCTTCTCATGACAAGATGCCAACAATrAAGGCAGAATTGGAAAAGCAACCrTTCAAAGGAAAACT^ 
GAGAGTTGATACCCAAGGTGTCCGTGTAGAAGCAAAATAA 

40 MKirvPATSANIGPGFDSVGVAVTKYLQIEVCEERDEWUEHQIGKV/IPHDERNLLLKlALQIVPDLQPRRl.KMTSDVPLA 
RGLGSSSSVIVAGIELANQLGQLNLSDHEKLQLATKIEGHPDNVAPAIYGHLVIASSVEGQVSAIVADFPECDFLAYIPNY 
ELRTRDSRSVLPKKLSYKEAVAASSIANVAVAALLAGDMVTAGQAIEGDLFHERYRQDLVREFAMIKQVTKENGAYAT 
YLSGAGPTVMVLASHDKMPTIKAELEKQPFKGKLHDLRVDTQGVRVEAK2 

ID20 564bp

ATGAAATATCACGATTACATCTGGGATTTAGGTGGAACITTACTGGATAATrATGAAACTTCA^ 
TTGAAACATTGGCACTGTATGGTATCACACAAGACCATGACAGTGTCrATCAAGCTTTAAAGGmCTACTCC^ 
TGCGATTGAGACATTCGCTCCCAATTTAGAGAATT-ril-lAGAAAAGTACAAGGAAAATGAAGCCAGAGAGCTTGA 
50 ACACCCG ATmATTTG AAGG AGTTTCTG ACCTATTGG AAG ACATTTCAAATCAAGGTGGCCGTCA im i ' l GGTC 

TCTCATCGAA ATGA TCAGGTTTTGGAAATTTTAGAAAAAACCTCTATAGCAGCTrATTnACAGAAGTGG^ 
CTAGCrCAGGCTTTAAGAGAAAGCCAAATCCCGAATCCATGCTTTATrTAAGAGAAAAGTATCAGATTAGCTCT^ 
GTCTTGTCATTGGTGATCGGCCGATTGATATCGAAGCAGGTCAAGCTGCAGGACTTGATACCCACTTGTTTACCAG 
TATCGTGAATTTAAGACAAGTATTAGACATATAA 

MKYHDYIWDLGGTLLDNYETSTAAFVETLALYGITQDHDSVYQALKVSTPFAIETFAPNLENFLEKYKENEARELEHPl 
LFEGVSDLLEDISNQGGRHFLVSHRNDQVLEILEKTSIAAYFTEVVTSSSGFKRKPNPESMLYLREKYQISSGLVIGDRPID 
lEAGQAAGLDTHLFTSIVNLRQVLDIZ 

ID21 1875bp

atgacagaagaaatcaaaaatctgcaggcacaggattatgatgccagtcaaattcaagttttagagggcttagag 
gctgttcgtatgcgtccagggatgtacattggatcaacctcaaaagaaggtcttcaccatctagtctgggaaattg 
ttgataactcaattgacgaggccttggcaggatttgccagccatattcaag t ' l 11 1 attgagccagatgattcgat 
65 tactgttgtggatgatgggcgtggtatcccagtcgatattcaggaaaaaacaggccgtcctgctgttgagaccgt 






CTirACAGTCCn-CACGCTGGAGGAAAGTTCGGCGGTGGTGGATACAAGGTrrCACGTGGTCITCACGGGGTGGC
GTCGTCAGTAGTTAATGCCCTTTCCACTCAATTAGACGTTCATGTTCACAAAAATGGTAAGATTCATTACCAAGAA 
TACCGTCGTGGTCATCTTGTCGCAGATCTTGAAATAGTTGGAGATACGGATAAAACAGGAACAACTGTTCACTTC 
ACACCGGACCCAAAAATCTTC^CTGAAACAACAATCTTrGATTrrGATAAATrAAATAAACGGATrCAAGAGTTG 
CCCTTTCTAAATCGCGGTCrTCAAATTTCAATrACAGATAAGCGCCAAGGTTTGGAACAAAC^ 
ATGAAGGTGGGATTGCTAGTTACGTTGAATATATCAACGAGAACAAGGATGTAATCrrTGATACACCAATCTATA 

S:^^°;^;^^5H^*^'I^^^^'^°''^ACTGAAAGACAATGAAGATAATTTAAC^^ 

^^,.TtISSIt£^^^°°'^^^°^*^°^^°°CTCGTGTGGCTGCCAAGCGTGCGCGTGAAGTC^ 
1 < AACTCTTCATCCTCGAAGG^ 

^ ^ JCCAATTCCCGGTAAGATTTrGAACGTrGAAAAAGCAAGTATGGATAAGATTCT^^ 

TCTTTrCACAGCCATGGGAACAGGATrrGGCGCAGAATTrGATGTTTCGAAAGCCCGTTACCAAAAACTCGTrTTG 
f^I5^C5°^'r°C«=GATGTCGATGGACCCCACATrCGTACCCTTCITrrAACaTGATrrATC^^^^ 

TATCCAGCCGGGTGCAGATCAAGAAATCAAACrCCAAGAAGCTTTAGCCCGTTATAGTGAAGGTCGTACCAAACC 
2° 2;?5If^^^°C°'^^TAAGGGGCrAGGTGAAATGGACGATCATCAGCTGTCGGAAAC/C^^^ 

TCGCTTGATGGCrAGAGTTTCTGTAGATGATGTGCAGAAGCAGATAAAATCriTGATATGTTGA ^^"^^ 

MTEEDCNLQAQDYDASQIQVLEGLEAVRMRPGMYlGSTSKEGLHHLVWEIVDNSIDEALAGFASHIOVFIEPDDSrrVVD 
DGRGIPVDIQEKTGRPAVETVFTVLHAGGKFGGGGYKVSGGLHGVGSSVVNAI^QLDVHVHKNGKIHYOE^^GIW 
25 VADLEIVGDTDKTGTTVHFTPDPKII^ETTIFDFDKLNKRIQELAFLNRGLQISITDKRQGLEQTTCH^ 

NENKDVlFDTPIYTDGEMDDITVEVAMQYTTGYHENVMSFANNIHTHEGGTHEQGFRTALTRVINDYARKNKlXmN 

RAR£VTRKKSGLEISNLPGKLADCSSNNPAETELFIVEGDSAGGSAKSGRNREFQAIU>IRGKILWEKASMWa^ 
RSLFTAMGTCFGAEFDVSKARYOKLVU^ADVDGAHrRTUXTLIYRYMKPllEAGYvWAQ^^ 
30 QPGADQEIKLQEALARYSEGRTKPnQRYKGLGEMDDHQLWETTMDPEHRLMARVSVDDVQKQIKSUCZ 
ID54 1446bp

J:3 TATTGTTAGTTT&liril-IATTGTTCTrAATCmAAGTACAATATCCTTGCTTTrAGATATCTrAATCT^ 
EI°£?™]JS^ACTAG-rTCCCrrGGTAGGGCrACrCTTGArrATCT^^ 

^^^51^'^ 2 *^^°^°*^°^'^^^°'°'*^°°°'^^AATAATGAAAATATrCAGAAATTACTAGCrGATATCAAGT^ 
^f^;^Ji?'^J^^^^*^^'^<^°A■'^'^^'^AAA^ 

T^^I°TI'^9.^°9.'^'^'"GACACCrATGGTCCTAmGTTCGGTGTCGCGATCAGATG^^ 

ATCGAGATACCAAGAAAATCCTCTTGACCACAACGCCACGTGATGCCTATGTACCAATCGCAGATGGTGGAAATA 
^^E:!^'^^:A3!^■^'^'^"■°ACTCATGCGGGCATTTATGGAGTTGATTCGTCCATTCACACCTr^^^ 
^iSJ?i^'^T^'J'^'^™^'^'^°'r°^°ATTGAACTTCACITCGTTT^ 

Tn-/Vr>^TGATCAAGAATTTACTGCCCATACGAATGGAAAGTATTACCCTGCAGGCAATGTTCATOT 
;^?i°°S:FS°°'^°'^^°'^°^°'^°^*<^CCCTAGCAGATGGCGATCGTGACCGCGGGCGC^^^^ 
50 °°I°^I[5I°$™CCITCAAAAATTAACGTCAACCGAAGTGCTG^^ 

^^^^T?I^i^=^'^i^^5^^^^J;^TGcc^ 

'^^™'^'^°^'^"'^CAAGATTTAAAAGGGACAGGTCGGATGGATCTTCCTrCTTATG^^ 
GATC? 

55 MSRRFKKSRSQKVKRSVNIVLLTIYLLLVCFLLFUFKYNII^FRYLNLVVTALVLLVALVGUJJIYKKAE!fFnFLLVF<! 
ILVSSVSLFAVQQFVGLTNRLNATSNYSEYSISVAVLADSBENVTQLTSVTA?rGTNS^>SS^^^ 
SSSYI^AYKSLIAGETKAIVLNSVFENIIESEYPDYASKnCKlYTKGFVKKVEAPK^^^ 
NILMTVNRDTKKa.LTTTPRDAYVPIADGGNNQKDKLTHAGIYGVDSSIffrLENLYGS™ 

fin '°Y;L^'^<5EFrAHTNGKYYPAGm'HU3SEQALGFVRERYSLADGDR^ 

60 dsiqtnmpletminlvnaqlesggnykvnsqducgtgrmdlpsyampdsnlyVmeidS/^vW^ 









ID55 732bp
ATGATAGACATCCATTCGCATATCGTTTTTGATGTAGATGACGGTCCCAAGTCAAGAGAGGAAAGCAAGGCTCTC
TTGGCAGAATCCTACAGACAGGGGGTGCGAACCATTGTTTCTACCTCTCACCGTCGCAAGGGCATGTTTGAAACTC 
CGGAAGAGAAGATAGCAGAAAACmCTTCAGGTTCGGGAAATAGCTAAGGAAGTGGCGAGTGACTTGGTCATTG 
CTTACGGGGCTGAAArrTATTACACACCAGATGTTCTGGATAAGCTGGAAAAAAAGCGGATTCCGACCCT 
5 ATAGTCGTTATGCCTTGATAGAGTrTAGTATGAACACTCCTTATCGCGATATTCATAGCGCCTTGAGCAAGATCT^ 
GATGTTGGGAATTACTCCAGTCATTGCCCACATTGAGCGCTATGATGCTCTTGAAAATAATGAAAAACGCGTTCGA 
GAACTGATCGATATGGGCTGTTACACGCAAGTAAATAGTTCACATGTCCTCAAACCCAAACriTin'GGCGAACGTT 
ATAAATTCATGAAAAAAAGAGCTCAGTAn i I'l iAGAGCAGGATTTGGTTCATGTCATTGCAAGTGATATGCACAA 
TCTAGACGGTAGACCTCCTCATATGGCAGAAGCATATGACCTTGTTACCCAAAAATACGGAGAAGCGAAGGCTCA 
1 0 GG AACTTTTT ATAG ACAATCCTCG AAAAATTGT AATGG ATC AACT AATTTAG 

MIDIHSHIVFDVDDGPKSR£ESKALLAESyR€?GVRTIVSTSHl^GMF£TPE£KlAENFLQVREIAKEVA5DLVlAYGA^ 
YYTPDVLDKLEKKRIPTLNDSRYALIEFSMNTPVRDIHSALSKILMLCrrPVIAiilERYDALENNEKRVI^LIDMGCYTQV 
NSSHVLKPKLFGERYKFMKKRAQYFLEQDLVHVIASDMHNLDGRPPHMAEAYDLVTQKYGEAKAQELHDNPRKIVM 
15 DQUZ 

ID58 3990bp

ttgatttatataatcgctatcaatataacaatgcaatcaggagqttttgcaatgaaacatgaaaaacaacagcgt 
20 ttttctattcgtaaatacgctgtaggagcagcrtctgttctaattggattrgcctrccaagc 

ccgatggagttactcctactactacagaaaaccaaccgaccatccatacggtttctgattcccctcaatcatccga 
aaatcggactgaggaaacacctaaagcagtgcrrcaaccagaagctccaaaaacrgtagaaacagaaactccag 
ctacrgataaggtagctagtcttccaaaaacagaagaaaaaccacaagaggaagttagttcaactcctagtgata 
aagcagaagtggtaactccaacttctgctgaaaaagaaactgctaataaajau^ggcagaagaagctagcc^ 
25 aaggaagaagcgaaagaggttgattctaaagagtcaaatacagacaagactgacaaggataaaccagctaaaaa 
agatgaagcgaaagcagaggctgacaaaccggcaacagaggcaggaaaggaacgtgctgcaactgtaaatgaaa 
aactagcgaaaaagaaaattgtttctattgatgctggacgtaaatatttctcaccagaacagctcaaggaaatca 

TCGATAAAGCGAjSACATTATGGCTACACTGATTTACACCTATTAGTCGGAAATGATGGACTCCGTTTCAT^ 
CGATATGAGCATCACAGCTAACGGCAAGACCTATGCCAGTGACGATGTCAAACGCGCCATTGAAAAAGGTACAAA 

30 TGATTATTACAACGATCCAAACGGCAATCACTTAACAGAAAGTCAAATGACAGATCTGATTAACTATGCCAAAGA 
TAAAGGTATCGGTCTCATTCCGACAGTAAATAGTCCTGGACACATGGATGCGATTCTCAATGCCATGAAAGAATr 
GGGAATCCAAAACCCTAACTTTAGCTATTTTGGGAAGAAATCAGCCCGTACTGTCGATCTTGACAACG 
TGTCGCTTITACAAAAGCCCTTATCGACAAGTATGCrGCTTATTrCGCGAAAAAGACTGAAATC^ 
CTTGATGAATATGCCAATGATGCGACAGATGCTAAAGGTTGGAGTGTGCrrCAAGCTGATAAATACTATCCAAAC 

35 GAAGGCTACCCTGTAAAAGGCTATGAAAAATTTATTGCCTACGCCAATGACCTCGCTCGTATTGTAAJW^TC^ 
GGTCTCAAACCAATGGCTITTAACGACGGTATCTACTACAATAGCGACACAAGCTTTGGTAGTTTTGAC 
ATCATCGTTTCTATGTGGACTGGTGGTTGGGGAGGCTACGATGTCGCTTCTrCTAAACTACTAGCTGAA^ 
ACCAAATCCTTAATACCAATGATGCTTGGTACTACGTTCTTGGACGAAACGCTGATGGCCAAGGCTGGTACAATCT 
CGATCAGGGGCTCAATGGTATTAAAAACACACCAATCACTTCTGTACCAAAAACAGAAGGAGCTGATATCCCAAT 

40 CATCGGTGGTATGGTAGCTGCTTGGGCTGACACTCCATCTGCACGTTATTCACCATCACGCCTCTTCAAACTCATG 
CGTCATTTTGCAAj^TGCCAACGCrGAATACrrCGCAGCTGATTATGAATCTGCAGAGCAAGCACTT 
CCAAAAGACCTGAACCGTTATACTGCAGAAAGCGTCACGGCCGTAAAAGAAGCTGAAAAAGCTATTCGCTCTCTC 

gatagcaaccttagccgtgcccaacaagatacgarrgatcaagccattgctaaacttcaagaaactgtcaacaac 
ttgaccctcacgccrgaagctcaaaaagaagaagaagctaaacgtgaggttgaaaaacttgccaaaaacaaggt 
45 aatctcaatcgatgctggacgcaaatactttactctgaaccagctcaaacgcatcgtagacaaggccagtgagct 
cggatattctgatgtccatctccrrctaggaaatgacggacttcgctttctactcgatgatatgaccat^ 

AACGGAAAAACCTATGCTAGTGATGACGTTAAAAAAGCTATTATCGAAGGAACTAAAGCTTACTACGACGATCCA 
AACGGTACTGCACTAACACAGGCAGAAGTAACAGAGCTAATTGAATACGCTAAATCTAAGGACATCGGTCTCATC 
CCAGC TATr AACAGTCCAGGTCACATGGATGCTATGCTGGTTGCCATGGAAAAATTAGGTATTAAAAATCCTCAA 

50 GCCCACTTTGATAAAGTTTC AAAA ACAACTATGGACTTGAAAAACGAAGAAGCGATGAACTrrGTAAAAGCCCT 
ATCGGTAAATACATGGACITCiri'GCAGGTAAAACAAAGATTTTCAACTTTGGTACTGACGAATACGCCAACGAT 
GCGACTAGTGCCCAAGGCTGGTACTACCTCAAGTGGTATCAACTCTATGGCAAATTTGCCGAATATGCCAACACC 
CTCGCAGCTATGGCCAAAGAAAGAGGGCTTCAACCAATGGCCrrCAACGATGGCTTCTACTATGAAGACAAGGAC 
GATGTTCAGTTTGACAAAGATGTCrrGATTrCTTACTGGTCTAAAGGCTGGTGGGGATATAACCT 

55 AATACCTAGCAAGCAAAGGCTATAAATTCTTGAATACCAACGGTGACTGGTACTACATTCTTGGTCAAAAACCAG 
AAGATGGTGGTGGTTTCCrCAAGAAAGCTATTGAGAATACTGGAAAAACACCATTCAATCAACTAGCTTCTACCA 
AATATCCTGAAGT AGATC TTCCAACAGTCGGAAGTATGCTTTCAATCTGGGCAGATAGACCAAGCGCTGAATACA 

aggaagaggaaatctttgaactcatgactgccrrrgcagaccacaacaaagactactttcgtgct 
ctctccgcgaagaattagctaaaattcctacaaacttagaaggatatagtaaagaaagtcttgaggcccttcacg 

60 cagctaaaacagctctaaattacaacctcaaccgtaataaacaagctgagcttgacacgcttgtagccaacctaa 
aagccgctcttcaaggcctcaaaccagctgtaactcattcaggaagcctagatgaaaatgaagtggct 
ttgaaaccacaccagaactcatcacaagaactgaagaaattccatttgaagttatcaagaaagaaaatcctaacc 
tcccagccggtcaggaaaatattatcacagcaggagtcaaaggtgaacgaactcattacatctctgtactcactg 
aaaatggaaaaacaacagaaacagtccttgatagccaggtaaccaaagaagttataaaccaagtggttgaagtt 

65 ggcgctcctgtaactcacaagggtgatgaaagtggtcttgcaccaactactgaggtaaaacctagactggatatc 



CAAGAAGAAGAAATTCCAmACCACAGTGACTTGTGAAAATCCACTCTTACTCAAAGGAAAAACAC^
ACTAAGGGCGTCAATGGACATCGTAGCAACTTCTACTCTGTGAGCACTTCn-GCCGATGGTAAGGAAGTGAA^ 
CTTGTAAATAGTGTCGTAGCACAGGAAGCCGTTACTCAAATAGTCGAAGTCGGAACTATGGTAACACATGTAGGC 
GATGAAAACGGACAAGCCGCTATTGCTGAAGAAAAACCAAAACTAGAAATCCCAAGCCAACCAGCTCCATCAAC 
5 TGCTCCTGCrGAGGAAAGCAAAGTTCrrCCTCAAGATCCAGCTCCTGTGGTAACAGAGAAAAAAC^ 

AGGAACTCACGATTCTGCAGGACTAGTAGTCGCAGGACrCATGTCCACACTAGCAGCCTATGGACTCACTAAAAG 
AAAAGAAGACTAA 

MIYnAINTTMQSGGFAMKHEKQQRFSlRKYAVGAASVUGFAFQAQTVAADGVTPTTTENQimH^ 

10 TPKAVLQPEAPKTVETETPATDKVASLPKTEEKPQEEVSSTPSDKAEVVTPTSAEKETANKKAEEASPKK^ 
S^m)KTDKDKPAKK0EAKAEADKPATEAGKERAATVNEia-AKKKP/SIDAGRKYFSPEQLKE^DKAKHY^ 
VGNDGLRFMLDDMSrrANGKTYASDDVKRAmKGTNDYYNDPNGNHLTESQMTDLIWAKDKGIGUm'NSPGHM 
AlLNAMKELGIQNPNFSYFGKKSARTVDLDNEQAVAFTKAUDKYAAYFAKKTEIFNIGtDEYANDATDAKGWSVLQA 
DKYYPNEGYPVKGYEKFIAYANDLARIVKSHGLKPMAFNDGIYYNSDTSFGSFDKDIIVSMWTGGWGGYDVASSKLLA 

15 EKGHQIL^^^NDAWYYVLGRNADGC^WYNLDQGLNGIK^^ITr^SVPKTEGADIP^GGMVAAWADTPSARYS^^ 

MRHFANANAEYFAADYESAEQALNEVPKDLNRYTAESVTAVKEAEKAlRSLDSNLSRAQQDTIDQAIA^a.QE^VNNLT 
LTPEAOK£EEAKI^VEKXAK^^CVISIDAGRKTFTLNQLKRIVDKASELGYSDVHLlXG^aXJLRFLL^ 
SDDVKKAlIEGTKAYYDDPNGTALTQAEVTEUEYAfCSKDIGLIPAINSPGHMDAMLVAMEKLGIKNPQAHFDKVSKT^ 
MDLKNEEAMNFVKALIGKYMDFFAGKTKIFNFGTOEYANDATSAQGWYYU:wyQLYGKFAEYA^^^LAAMAKERGL 

20 QPMAFNDGFYYEDKDDVQFDKDVLtSYWSKGWWGYNL^PQYLASKGYKFLNTNGDWYYILGQKPEDGGGFLKKAI 
ENTGKTPFNQLASTKYPEVDLPTVGSMLSIWADRPSAEYKEEEIFELMTAFADHNKDYFRANYNALREELAW 
YSKESLEALDAAKTAL^^f^a-NRNKQAELDTLVANLKAALQGI^PAVTHSGSLDE^^EVAANVETRPEL^^RTEEIPre^ 
KKENPNLPAGQENnTAGVKGERTHYISVLTENGKTTEWLDSQVTKEVlNQVVEVGAPVTHKGDESGLAPTTEVKPRL 
DIQEEEIPFITVTCENPLLUCGKTQVrrKGVNGHRSNFYSVSTSADGKEVKTLVNSWAQEAVTQIVEVGTMVTHVGDE 

25 NGQAA1AEEKPKLEIPSQPAPSTAPAEESKVLPQDPAPVVTEKKLPETGTHDSAGLWAGU^STLAAYGLTKR3CEDZ_ 

ID122 825bp 

ATGAACAAAAAAACAAGACAGACACTAATCGGACTGCTAGTGTTATTGCTTTTGTCTACAGGGAG^^ 
30 AAGCAGATGCCGTCGGCACCTAATAGTCCCAAAACCAATCTTAGTCAGAAAAAACAAGCGTCTGAAGCrC^^ 
CAAGCATTGGCAGAGAGTGTCrrAACAGACGCAGTCAAGAGTCAAATAAAGGGGAGTCTGGAGTGGAATGGCT^ 
AGGTGCTTTTATCGTCAATGGTAATAAAACAAATCTAGATGCCAAGGTTTCAAGTAAGCCCTACGCTGACAATAA 
AACAAAGACAGTGGGCAAGGAAACTGTTCCAACCGTAGCTAATGCCCTCTTGTCTAAGGCCACTCCTCAGTACAA 
GAATCGTAAAGAAACrGGGAATGGTTCAACL"l'CrrGGACTCCn"CCAGGTTGGCATCAGGTCAAGAATCTAAA 
35 CTCTTATACCCATGCAGTCGATAGAGGTCATTTGTTAGGCTATGCCrTAATCGGTGGTTTGGATGGTTT^ 

CAACAAGCAATCCTAAAAACATTGCTGTTCAGACAGCCTGGGCAAATCAGGCACAAGCCGAGTATTCGACTGGTC 
AAAACTACTATGAAAGCAAGGTGCGTAAAGCCTTGGACCAAAACAAGCGTGTCCGTTACCGTGTAACCCTTT^ 
ACGCrrCAAACGAGGATTTAGTTCCCTCAGCTTCACAGATTGAAGCCAAGTCTTCGGATGGAGAATTGGA^ 
ATGTTCTAGTTCCCAATGTTCAAAAGGGACTTCAACTGGATTACCGAACrGGAGAAGTAACTGTAACTCAGTAA 


MNKKTRQTLIGLLVLLLLSTGSYYIKQMPSAPNSPKTNLSQKKQASEAPSOAiJ^.ESVLTDAVKSQIKGSLEWNGSGAF^ 
NGNKTNLDAKVSSKPYADNKTKTVGKETVPTVANALI^KATRQYKNRKETGNGSTSWTPPGWHQVKNLKGSYTHAV 
DRGHLLGYALIGGLDGFDASTSNPKNIAVQTAWANQAQAEYSTGQNYYESKVRKALDQNKRVRYRVTLYYASNEDL 
VPSASQIEAICSSDGELEFNVLVPNVQKGLQLDYRTGEVTVTQ2 


ID123 225bp

GTGCTAAGATTCAGCGGATTGAGGCAAGTGATGAAGATGAATAAGAAATCAAGCTACGTAGTCAAGCGTTTACTT 
TTAGTCATCATAGTACTGATTTTAGGTACTCTGGCTCTAGGAATCGGTTTAATGGTAGGTTATGGAATCr^ 
50 AGGGTCAAGATCCATGGGCTATCCTGTCTCCAGCAAAATGGCAGGAATTGATTCATAAATTTACAGGAAATTAG 

VLRFSGLRQVMKMNKKSSYVVKRLLLVirVLILGTLALGlGLMVGYGlLGKGQDPWAILSPAKWQELIHKFTGNZ 






